On second thought, it might not be a bug in python-tz, but rather undefined behavior that results from pandas' use of tz._utcoffset:

> tz = pytz.timezone('Asia/Tokyo')
> dt = datetime.datetime(2011,1,1)
>
> In [76]: tz.utcoffset(dt)
> Out[76]: datetime.timedelta(0, 32400)
>
> In [77]: tz._utcoffset
> Out[77]: datetime.timedelta(0, 33540)

In the first case tz.utcoffset has a reference date and can select the proper offset, which in 2011 is 09:00. tz._utcoffset, however, doesn't know which year it refers to, so it picks a single offset, in this case the first one on the list, which carries the additional 19 minutes.

I also do not understand why the test fails only now, though, or why pandas uses one code path to define the test case and another to create the expected value.

Best,
Gert
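
P.S. To make the 19 minutes explicit, here is a minimal sketch that converts the two timedelta values from the session above into hours and minutes. The helper name as_hhmm is only for illustration. The second half is optional and assumes pytz is installed; it uses only the public calls tz.utcoffset(dt) and tz.localize(dt), both of which get a reference date and so should not depend on whichever offset happens to be stored in tz._utcoffset.

```python
import datetime

# Values copied from the interpreter session above.
offset_2011 = datetime.timedelta(0, 32400)     # tz.utcoffset(dt)
offset_default = datetime.timedelta(0, 33540)  # tz._utcoffset


def as_hhmm(delta):
    """Format a non-negative timedelta as HH:MM (illustrative helper)."""
    minutes = int(delta.total_seconds()) // 60
    return "%02d:%02d" % divmod(minutes, 60)


print(as_hhmm(offset_2011))          # 09:00
print(as_hhmm(offset_default))       # 09:19
print(offset_default - offset_2011)  # 0:19:00

# Optional: only runs if pytz is available (an assumption about the setup).
try:
    import pytz
except ImportError:
    pytz = None

if pytz is not None:
    tz = pytz.timezone('Asia/Tokyo')
    dt = datetime.datetime(2011, 1, 1)
    # Public API: the offset is resolved for the given date.
    print(tz.utcoffset(dt))
    # localize() attaches the tzinfo variant that is valid at dt.
    print(tz.localize(dt).utcoffset())
```

So the two values differ by exactly 19 minutes, which matches the explanation that tz._utcoffset holds an offset that is not the one valid in 2011.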