[GitHub] [drill] paul-rogers commented on issue #2298: Support the UTC formatter in the JSON Reader

GitBox Sat, 21 Aug 2021 13:51:21 -0700


paul-rogers commented on issue #2298:
URL: https://github.com/apache/drill/issues/2298#issuecomment-903174842



   Hi @luocooong, as it turns out, the stack trace shows the old JSON reader. 
The key is the package: `org.apache.drill.exec.vector.complex.fn.JsonReader`. 
The new one is in `org.apache.drill.exec.store.easy.json`.
   
   The old one seems to have been meant as an independent module within the 
Value Vector package. The new one builds on top of the vector accessors and EVF.
   
   The key limitation of the old one (aside from the date issue you hit) is 
that it has no notion of vector sizes: it fills each batch to a fixed record 
count, regardless of how wide the records are. If your records are big, the old 
one will allocate an excessive amount of direct memory. In an extreme case, it 
can allocate vectors in excess of 16GB, which will cause direct memory 
fragmentation and eventual OOM errors. This memory issue was the original reason 
for building the vector accessor mechanism, which is the heart of the EVF.
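
   To make the arithmetic concrete, here is a rough sketch contrasting a fixed 
record count with a size-aware one. It is not Drill code: the class and method 
names are hypothetical, and the 4096-row count and 16 MiB budget are made-up 
numbers chosen only for illustration.

```java
public class BatchSizingSketch {

    // Memory a batch needs when the row count is fixed, whatever the row width.
    public static long fixedCountBatchBytes(int rowsPerBatch, long avgRowBytes) {
        return (long) rowsPerBatch * avgRowBytes;
    }

    // Row count when the batch is capped by a byte budget as well as a row limit.
    // Always allows at least one row so a single oversized record can still be read.
    public static int sizeAwareRowCount(int maxRows, long avgRowBytes, long batchByteBudget) {
        long rowsThatFit = Math.max(1L, batchByteBudget / Math.max(1L, avgRowBytes));
        return (int) Math.min((long) maxRows, rowsThatFit);
    }

    public static void main(String[] args) {
        int rows = 4096;                 // hypothetical fixed record count
        long budget = 16L * 1024 * 1024; // hypothetical per-batch byte budget
        long wideRow = 1024L * 1024;     // hypothetical 1 MiB record

        System.out.println("fixed count, wide rows: "
                + fixedCountBatchBytes(rows, wideRow) + " bytes");
        System.out.println("size-aware, wide rows:  "
                + sizeAwareRowCount(rows, wideRow, budget) + " rows");
    }
}
```

   With narrow rows the two approaches agree; with wide rows the fixed count 
grows without bound, while the size-aware count shrinks to fit the budget.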
   
   Now, I'm sure you just need the Mongo reader to work and are not tasked with 
revising it to use the new JSON parser. So, a temporary solution is simply to 
ensure that the old reader follows the same Mongo date/time specs as the new 
reader does. Links to the Mongo specs are in the note above.
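
   As a starting point, here is a minimal sketch of what "following the Mongo 
date spec" might look like. It assumes the Mongo Extended JSON conventions in 
which a `$date` value is either an ISO-8601 string (relaxed form) or a 
`$numberLong` holding milliseconds since the epoch (canonical form). The class 
and method names are hypothetical and this is not the code in either Drill 
reader; it only shows the conversion to epoch milliseconds with `java.time`.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

public class MongoDateSketch {

    // Relaxed form: {"$date": "2021-08-21T20:51:21Z"}
    // OffsetDateTime.parse accepts both a trailing "Z" and an explicit offset
    // such as "+01:00", with optional fractional seconds.
    public static long fromIsoString(String text) {
        return OffsetDateTime.parse(text).toInstant().toEpochMilli();
    }

    // Canonical form: {"$date": {"$numberLong": "1629579081000"}}
    // The value is already milliseconds since the epoch, carried as a string.
    public static long fromNumberLong(String text) {
        return Long.parseLong(text);
    }

    public static void main(String[] args) {
        long relaxed = fromIsoString("2021-08-21T20:51:21Z");
        long canonical = fromNumberLong(Long.toString(relaxed));
        // Both forms should describe the same instant in UTC.
        System.out.println(Instant.ofEpochMilli(relaxed));
        System.out.println(relaxed == canonical);
    }
}
```

   Normalizing both forms to epoch milliseconds in UTC keeps the result 
independent of the server's local time zone.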


