I am writing a unit test to check that a Pandas DataFrame produced by Arrow
is equal to one constructed directly from data. The timestamp values are
Python datetime objects with a tzinfo attached. When I compare the
results, the values are equal but the schema is not: with Arrow the column
type is "datetime64[ns]", and without it the type is "object". Without a
tzinfo the types match, but I do need it there for the conversion with
Arrow data. I could just replace the tzinfo when building the Pandas
DataFrame, since it is a naive timezone with utcoffset=None. Does anyone
know another way to produce compatible types? I do need the data to be
compatible with Spark too.

Hopefully this makes sense. I could attach more code if that would help,
thanks! Here is a sample of the data:
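from datetime import datetime, tzinfo
import pandas as pd

class NaiveTZ(tzinfo):
    # Both methods return None, so datetimes using this tzinfo count as naive.
    def utcoffset(self, date_time):
        return None

    def dst(self, date_time):
        return None

data = {"timestamp_t": [datetime(2011, 1, 1, 1, 1, 1, tzinfo=NaiveTZ())]}
pd.DataFrame(data)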
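
For reference, the workaround I mentioned (dropping the tzinfo before the data reaches Pandas) could be sketched roughly as follows. The name strip_tzinfo is a hypothetical helper of my own, not part of Pandas or Arrow, and I have not checked how this interacts with the Spark side.

```python
from datetime import datetime, tzinfo


class NaiveTZ(tzinfo):
    """Same tzinfo as in the sample: no UTC offset, so values stay naive."""

    def utcoffset(self, date_time):
        return None

    def dst(self, date_time):
        return None


def strip_tzinfo(data):
    """Hypothetical helper: copy a dict of columns, removing tzinfo from
    every datetime value and leaving all other values unchanged."""
    return {
        column: [
            value.replace(tzinfo=None) if isinstance(value, datetime) else value
            for value in values
        ]
        for column, values in data.items()
    }


data = {"timestamp_t": [datetime(2011, 1, 1, 1, 1, 1, tzinfo=NaiveTZ())]}
stripped = strip_tzinfo(data)  # same wall-clock values, tzinfo is None
```

Passing stripped instead of data to pd.DataFrame is the "without a tzinfo" case described above, where the types match.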