> On March 19, 2015, 7:40 a.m., Aman Sinha wrote: > > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonProcessor.java, > > line 37 > > <https://reviews.apache.org/r/32223/diff/1/?file=899460#file899460line37> > > > > Is it necessary to have this method in the interface ? The name > > suggests that implementors should ensure at least 1 field/column but the > > counting reader does not actually do that.
I would imagine this be useful for an empty json input. I will implement this in counting reader. > On March 19, 2015, 7:40 a.m., Aman Sinha wrote: > > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/reader/CountingJsonReader.java, > > line 36 > > <https://reviews.apache.org/r/32223/diff/1/?file=899462#file899462line36> > > > > The default JsonReader (which is used when skip-all is false) has a > > initial while loop to iterate over the tokens; is that not needed here > > because you are expecting to be either at end-of-stream or at the beginning > > of a record ? I am wondering what happens where a single large record (with > > either many fields or a large string field) spans across batch boundary. (I > > am actually not completely sure if that is allowed, so let me know if that > > situation is not going to occur). i) I am not sure why this initial loop in the original reader is useful. ii) I think parser works on entire json stream across batch boundaries. Wide records used to be a problem before auto reallocation came in now we do re-alloc as needed. Besides since we are not particularly interested in fields and just counting, footprint should be small. - Hanifi ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32223/#review77027 ----------------------------------------------------------- On March 18, 2015, 11:34 p.m., Hanifi Gunes wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/32223/ > ----------------------------------------------------------- > > (Updated March 18, 2015, 11:34 p.m.) > > > Review request for drill, Aman Sinha and Parth Chandra. > > > Bugs: DRILL-2193 > https://issues.apache.org/jira/browse/DRILL-2193 > > > Repository: drill-git > > > Description > ------- > > DRILL-2193: implement fast count / skip-all semantics for JSON reader > > This patch introduces an abstraction for JSON processing and implements a > efficient counting JSON reader if query is in skip-all mode(see DRILL-2358). > > > Diffs > ----- > > > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java > c343177a719b5f36f51bcb2f84d68518ba1ae02f > > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonProcessor.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/reader/BaseJsonProcessor.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/reader/CountingJsonReader.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java > cc5c8af63c6383eb8d2e28a409a3c055bf5cc737 > > Diff: https://reviews.apache.org/r/32223/diff/ > > > Testing > ------- > > unit + regression > > > Thanks, > > Hanifi Gunes > >
