Hi, I could use some input. I am periodically extending an Arrow table with additional Arrow data. The data is retrieved from the same place each time, so the schema is the same. On each poll I create a fresh table from the incoming buffer, add some client-side columns, and then concatenate the new table onto the primary table.
This works well until I use the countBy method, which appears to use only the dictionary of the last batch, i.e. the one from the most recent poll. That dictionary might account for 1% of the data in the table, so it is definitely not a delta of everything before it.

What's my next step? I'm close to just fixing the countBy function, but that doesn't solve the core problem: the last batch dictionary is supposed to be the most complete delta of the previous batches. As it stands, any use of the batch dictionaries will be invalid, because each one reflects only its own batch.

So far I've tried concatenating batches/chunks, and retaining all buffers from every poll iteration and loading them at the same time via a batchreader.all (hoping some logic I've not seen would unify the batch dictionaries).

Thanks,
-Dan Lustig
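
P.S. To make the expected behaviour concrete, here is a minimal sketch written in plain TypeScript rather than against Arrow's API. DictBatch and countByAcrossBatches are hypothetical names made up for illustration, and the sample data is invented. The idea is only that each batch's indices get resolved through that batch's own dictionary, with the counts merged by value, instead of everything being read through the last dictionary.

```typescript
// Hypothetical stand-in for one dictionary-encoded column chunk.
// This is NOT an Arrow type; it only models "indices + per-batch dictionary".
interface DictBatch {
  dictionary: string[]; // distinct values seen in this batch only
  indices: number[];    // one entry per row, pointing into `dictionary`
}

// Count occurrences per value across all batches. Each index is decoded
// with the dictionary of the batch it came from, so no single dictionary
// has to cover the whole table.
function countByAcrossBatches(batches: DictBatch[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const batch of batches) {
    for (const index of batch.indices) {
      const value = batch.dictionary[index];
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return counts;
}

// Two polls whose dictionaries overlap only partially (invented data).
const firstPoll: DictBatch = { dictionary: ["a", "b"], indices: [0, 1, 1] };
const secondPoll: DictBatch = { dictionary: ["b", "c"], indices: [0, 1] };

const counts = countByAcrossBatches([firstPoll, secondPoll]);
```

In this model "b" is counted across both polls even though it sits at a different index in each dictionary, which is the result I would expect from countBy on the concatenated table.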
