> On March 19, 2015, 7:40 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonProcessor.java,
> >  line 37
> > <https://reviews.apache.org/r/32223/diff/1/?file=899460#file899460line37>
> >
> >     Is it necessary to have this method in the interface? The name 
> > suggests that implementors should ensure at least one field/column, but the 
> > counting reader does not actually do that.

I would imagine this would be useful for empty JSON input. I will implement it 
in the counting reader.


> On March 19, 2015, 7:40 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/reader/CountingJsonReader.java,
> >  line 36
> > <https://reviews.apache.org/r/32223/diff/1/?file=899462#file899462line36>
> >
> >     The default JsonReader (which is used when skip-all is false) has an 
> > initial while loop to iterate over the tokens; is that not needed here 
> > because you are expecting to be either at end-of-stream or at the beginning 
> > of a record? I am wondering what happens when a single large record (with 
> > either many fields or a large string field) spans a batch boundary. (I 
> > am actually not completely sure if that is allowed, so let me know if that 
> > situation is not going to occur.)

i) I am not sure why this initial loop in the original reader is useful.

ii) I think the parser works on the entire JSON stream, across batch 
boundaries. Wide records used to be a problem before auto-reallocation came 
in; now we re-allocate as needed. Besides, since we are only counting and are 
not particularly interested in the fields, the footprint should be small.
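
To illustrate why counting keeps the footprint small, here is a minimal sketch. 
It is not the CountingJsonReader from this patch: the class name RecordCounter 
and the method countRecords are hypothetical, and it assumes the input is a 
sequence of well-formed top-level JSON objects. It reads the stream one 
character at a time, tracks only brace depth and string state, and never 
materializes a field, so a wide record or a long string costs no extra memory.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; not the reader implemented in the patch. */
public class RecordCounter {

    /** Counts top-level JSON objects in a stream without storing any field. */
    public static long countRecords(Reader in) throws IOException {
        long count = 0;
        int depth = 0;            // current nesting depth of { }
        boolean inString = false; // true while inside a quoted string
        boolean escaped = false;  // true right after a backslash in a string
        int c;
        while ((c = in.read()) != -1) {
            if (inString) {
                if (escaped) {
                    escaped = false;      // the escaped character is skipped
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;     // closing quote
                }
            } else if (c == '"') {
                inString = true;          // opening quote
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    count++;              // a top-level record just closed
                }
            }
        }
        return count;
    }

    /** Convenience overload for in-memory input. */
    public static long countRecords(String json) {
        try {
            return countRecords(new StringReader(json));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the only state is a counter, a depth, and two flags, a record that 
spans a batch boundary needs no special handling in this sketch; the scan 
simply continues over the stream.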


- Hanifi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32223/#review77027
-----------------------------------------------------------


On March 18, 2015, 11:34 p.m., Hanifi Gunes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32223/
> -----------------------------------------------------------
> 
> (Updated March 18, 2015, 11:34 p.m.)
> 
> 
> Review request for drill, Aman Sinha and Parth Chandra.
> 
> 
> Bugs: DRILL-2193
>     https://issues.apache.org/jira/browse/DRILL-2193
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> DRILL-2193: implement fast count / skip-all semantics for JSON reader
> 
> This patch introduces an abstraction for JSON processing and implements an 
> efficient counting JSON reader that is used when the query is in skip-all 
> mode (see DRILL-2358).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java c343177a719b5f36f51bcb2f84d68518ba1ae02f
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonProcessor.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/reader/BaseJsonProcessor.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/reader/CountingJsonReader.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java cc5c8af63c6383eb8d2e28a409a3c055bf5cc737
> 
> Diff: https://reviews.apache.org/r/32223/diff/
> 
> 
> Testing
> -------
> 
> unit + regression
> 
> 
> Thanks,
> 
> Hanifi Gunes
> 
>
