[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/518


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-09-12 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r78475862
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonProcessor.java
 ---
@@ -30,6 +30,8 @@
 
   public static enum ReadState {
 END_OF_STREAM,
+JSON_RECORD_PARSE_ERROR,
--- End diff --

Would be helpful to add a comment to describe the meaning of these new 
states.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-07-14 Thread Subbu Srinivasan
Thanks chunhui-shi .

Is there a recommended way to test for count of results of a query?


On Thu, Jul 14, 2016 at 1:41 PM, chunhui-shi  wrote:

> Github user chunhui-shi commented on a diff in the pull request:
>
> https://github.com/apache/drill/pull/518#discussion_r70879554
>
> --- Diff:
> exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
> ---
> @@ -159,64 +164,91 @@ public void drill_3353() throws Exception {
>test("create table dfs_test.tmp.drill_3353 as select a from
> dfs.`${WORKING_PATH}/src/test/resources/jsoninput/drill_3353` where e =
> true");
>String query = "select t.a.d cnt from dfs_test.tmp.drill_3353 t
> where t.a.d is not null";
>test(query);
> -  testBuilder()
> -  .sqlQuery(query)
> -  .unOrdered()
> -  .baselineColumns("cnt")
> -  .baselineValues("1")
> -  .go();
> +  testBuilder().sqlQuery(query).unOrdered().baselineColumns("cnt")
> +  .baselineValues("1").go();
>  } finally {
>testNoResult("alter session set `store.json.all_text_mode` =
> false");
>  }
>}
>
> -  @Test // See DRILL-3476
> +  @Test
> +  // See DRILL-3476
>public void testNestedFilter() throws Exception {
>  String query = "select a from cp.`jsoninput/nestedFilter.json` t
> where t.a.b = 1";
>  String baselineQuery = "select * from
> cp.`jsoninput/nestedFilter.json` t where t.a.b = 1";
> -testBuilder()
> -.sqlQuery(query)
> -.unOrdered()
> -.sqlBaselineQuery(baselineQuery)
> +
> testBuilder().sqlQuery(query).unOrdered().sqlBaselineQuery(baselineQuery)
>  .go();
>}
>
> - @Test // See DRILL-4653
> -public void testSkippingInvalidJSONRecords() throws Exception {
> -try
> -{
> -  String set = "alter session set `" +
> ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = true";
> -  String query = "select count(*) from
> cp.`jsoninput/DRILL-4653.json`";
> +  @Test
> +  // See DRILL-4653
> +  /* Test for CountingJSONReader */
> +  public void testCountingQuerySkippingInvalidJSONRecords() throws
> Exception {
> +try {
> +  String set = "alter session set `"
> +  + ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG + "`
> = true";
> +  String set1 = "alter session set `"
> +  +
> ExecConstants.JSON_READER_PRINT_INVALID_RECORDS_LINE_NOS_FLAG
> +  + "` = true";
> +  String query = "select count(*) from
> cp.`jsoninput/drill4653/file.json`";
> +  testNoResult(set);
> +  testNoResult(set1);
> +
> testBuilder().unOrdered().sqlQuery(query).sqlBaselineQuery(query).build()
> +  .run();
> +} finally {
> +  String set = "alter session set `"
> +  + ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG + "`
> = false";
>testNoResult(set);
> -  testBuilder()
> -  .unOrdered()
> -  .sqlQuery(query)
> -  .sqlBaselineQuery(query)
> -  .build().run();
>  }
> -finally
> -{
> -  String set = "alter session set `" +
> ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = false";
> +  }
> +
> +  @Test
> +  // See DRILL-4653
> +  /* Test for CountingJSONReader */
> +  public void testCountingQueryNotSkippingInvalidJSONRecords() throws
> Exception {
> +try {
> +  String query = "select count(*) from
> cp.`jsoninput/drill4653/file.json`";
> +
> testBuilder().unOrdered().sqlQuery(query).sqlBaselineQuery(query).build()
> --- End diff --
>
> Should we compare with expected count here?
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-07-14 Thread chunhui-shi
Github user chunhui-shi commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r70879554
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -159,64 +164,91 @@ public void drill_3353() throws Exception {
   test("create table dfs_test.tmp.drill_3353 as select a from 
dfs.`${WORKING_PATH}/src/test/resources/jsoninput/drill_3353` where e = true");
   String query = "select t.a.d cnt from dfs_test.tmp.drill_3353 t 
where t.a.d is not null";
   test(query);
-  testBuilder()
-  .sqlQuery(query)
-  .unOrdered()
-  .baselineColumns("cnt")
-  .baselineValues("1")
-  .go();
+  testBuilder().sqlQuery(query).unOrdered().baselineColumns("cnt")
+  .baselineValues("1").go();
 } finally {
   testNoResult("alter session set `store.json.all_text_mode` = false");
 }
   }
 
-  @Test // See DRILL-3476
+  @Test
+  // See DRILL-3476
   public void testNestedFilter() throws Exception {
 String query = "select a from cp.`jsoninput/nestedFilter.json` t where 
t.a.b = 1";
 String baselineQuery = "select * from cp.`jsoninput/nestedFilter.json` 
t where t.a.b = 1";
-testBuilder()
-.sqlQuery(query)
-.unOrdered()
-.sqlBaselineQuery(baselineQuery)
+
testBuilder().sqlQuery(query).unOrdered().sqlBaselineQuery(baselineQuery)
 .go();
   }
 
- @Test // See DRILL-4653
-public void testSkippingInvalidJSONRecords() throws Exception {
-try
-{
-  String set = "alter session set `" + 
ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = true";
-  String query = "select count(*) from cp.`jsoninput/DRILL-4653.json`";
+  @Test
+  // See DRILL-4653
+  /* Test for CountingJSONReader */
+  public void testCountingQuerySkippingInvalidJSONRecords() throws 
Exception {
+try {
+  String set = "alter session set `"
+  + ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG + "` = 
true";
+  String set1 = "alter session set `"
+  + ExecConstants.JSON_READER_PRINT_INVALID_RECORDS_LINE_NOS_FLAG
+  + "` = true";
+  String query = "select count(*) from 
cp.`jsoninput/drill4653/file.json`";
+  testNoResult(set);
+  testNoResult(set1);
+  
testBuilder().unOrdered().sqlQuery(query).sqlBaselineQuery(query).build()
+  .run();
+} finally {
+  String set = "alter session set `"
+  + ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG + "` = 
false";
   testNoResult(set);
-  testBuilder()
-  .unOrdered()
-  .sqlQuery(query)
-  .sqlBaselineQuery(query)
-  .build().run();
 }
-finally
-{
-  String set = "alter session set `" + 
ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = false";
+  }
+
+  @Test
+  // See DRILL-4653
+  /* Test for CountingJSONReader */
+  public void testCountingQueryNotSkippingInvalidJSONRecords() throws 
Exception {
+try {
+  String query = "select count(*) from 
cp.`jsoninput/drill4653/file.json`";
+  
testBuilder().unOrdered().sqlQuery(query).sqlBaselineQuery(query).build()
--- End diff --

Should we compare with expected count here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-07-14 Thread chunhui-shi
Github user chunhui-shi commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r70879473
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -159,64 +164,91 @@ public void drill_3353() throws Exception {
   test("create table dfs_test.tmp.drill_3353 as select a from 
dfs.`${WORKING_PATH}/src/test/resources/jsoninput/drill_3353` where e = true");
   String query = "select t.a.d cnt from dfs_test.tmp.drill_3353 t 
where t.a.d is not null";
   test(query);
-  testBuilder()
-  .sqlQuery(query)
-  .unOrdered()
-  .baselineColumns("cnt")
-  .baselineValues("1")
-  .go();
+  testBuilder().sqlQuery(query).unOrdered().baselineColumns("cnt")
+  .baselineValues("1").go();
 } finally {
   testNoResult("alter session set `store.json.all_text_mode` = false");
 }
   }
 
-  @Test // See DRILL-3476
+  @Test
+  // See DRILL-3476
   public void testNestedFilter() throws Exception {
 String query = "select a from cp.`jsoninput/nestedFilter.json` t where 
t.a.b = 1";
 String baselineQuery = "select * from cp.`jsoninput/nestedFilter.json` 
t where t.a.b = 1";
-testBuilder()
-.sqlQuery(query)
-.unOrdered()
-.sqlBaselineQuery(baselineQuery)
+
testBuilder().sqlQuery(query).unOrdered().sqlBaselineQuery(baselineQuery)
 .go();
   }
 
- @Test // See DRILL-4653
-public void testSkippingInvalidJSONRecords() throws Exception {
-try
-{
-  String set = "alter session set `" + 
ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = true";
-  String query = "select count(*) from cp.`jsoninput/DRILL-4653.json`";
+  @Test
+  // See DRILL-4653
+  /* Test for CountingJSONReader */
+  public void testCountingQuerySkippingInvalidJSONRecords() throws 
Exception {
+try {
+  String set = "alter session set `"
+  + ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG + "` = 
true";
+  String set1 = "alter session set `"
+  + ExecConstants.JSON_READER_PRINT_INVALID_RECORDS_LINE_NOS_FLAG
+  + "` = true";
+  String query = "select count(*) from 
cp.`jsoninput/drill4653/file.json`";
+  testNoResult(set);
+  testNoResult(set1);
+  
testBuilder().unOrdered().sqlQuery(query).sqlBaselineQuery(query).build()
--- End diff --

Should we verify the expected count against the result?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-07-14 Thread chunhui-shi
Github user chunhui-shi commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r70872218
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -56,11 +57,19 @@ public void trySimpleQueryWithLimit() throws Exception {
 test("select * from cp.`limit/test1.json` limit 10");
   }
 
-  @Test// DRILL-1634 : retrieve an element in a nested array in a repeated 
map.  RepeatedMap (Repeated List (Repeated varchar))
+  @Test
+  // DRILL-1634 : retrieve an element in a nested array in a repeated map.
+  // RepeatedMap (Repeated List (Repeated varchar))
   public void testNestedArrayInRepeatedMap() throws Exception {
 test("select a[0].b[0] from cp.`jsoninput/nestedArray.json`");
 test("select a[0].b[1] from cp.`jsoninput/nestedArray.json`");
-test("select a[1].b[1] from cp.`jsoninput/nestedArray.json`");  // 
index out of the range. Should return empty list.
+test("select a[1].b[1] from cp.`jsoninput/nestedArray.json`"); // 
index out
--- End diff --

comment format


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-07-14 Thread chunhui-shi
Github user chunhui-shi commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r70863798
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java
 ---
@@ -110,21 +118,29 @@ public void ensureAtLeastOneField(ComplexWriter 
writer) {
 emptyStatus.set(i, true);
   }
   if (i == 0 && !allTextMode) {
-// when allTextMode is false, there is not much benefit to 
producing all the empty
-// fields; just produce 1 field.  The reason is that the type of 
the fields is
-// unknown, so if we produce multiple Integer fields by default, a 
subsequent batch
-// that contains non-integer fields will error out in any case.  
Whereas, with
-// allTextMode true, we are sure that all fields are going to be 
treated as varchar,
-// so it makes sense to produce all the fields, and in fact is 
necessary in order to
+// when allTextMode is false, there is not much benefit to 
producing all
--- End diff --

Seems the line width changed here(line 121-132). Please reorganize the 
text. If possible keep the original text unchanged. The same in line 140-143


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread ssriniva123
Github user ssriniva123 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67431369
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +191,33 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
-//Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
-//  logger.debug("Wrote record.");
-  recordCount++;
-}else{
-//  logger.debug("Exiting.");
-  break outside;
-}
-
+outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+try
+  {
+writer.setPosition(recordCount);
--- End diff --

Aman,
maven checkstyle:checkstyle did not report any errors before I did my last
check in. I have changed to reflect 2 spaces for indendation.

On Thu, Jun 16, 2016 at 2:22 PM, Aman Sinha 
wrote:

> In
> 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
> :
>
> > -  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
> > -writer.setPosition(recordCount);
> > -write = jsonReader.write(writer);
> > -
> > -if(write == ReadState.WRITE_SUCCEED) {
> > -//  logger.debug("Wrote record.");
> > -  recordCount++;
> > -}else{
> > -//  logger.debug("Exiting.");
> > -  break outside;
> > -}
> > -
> > +outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
> > +try
> > +  {
> > +writer.setPosition(recordCount);
>
> seems this is still doing indent of 4. We use 2 spaces (see
> https://drill.apache.org/docs/apache-drill-contribution-guidelines/
> scroll down to Step 2). Did it pass the mvn command line build without
> checkstyle violations ?
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> 
,
> or mute the thread
> 

> .
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67426795
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -179,4 +180,43 @@ public void testNestedFilter() throws Exception {
 .sqlBaselineQuery(baselineQuery)
 .go();
   }
+
+ @Test // See DRILL-4653
+public void testSkippingInvalidJSONRecords() throws Exception {
+try
+{
+String set = "alter session set `" + 
ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = true";
--- End diff --

these should be indented inside the try block with 2 spaces.   It is best 
to set the indent level in your IDE (I can help with Eclipse if you are using 
it;  if you are using IntelliJ I can find out from other developers using 
IntelliJ). 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67426381
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +191,33 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
-//Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
-//  logger.debug("Wrote record.");
-  recordCount++;
-}else{
-//  logger.debug("Exiting.");
-  break outside;
-}
-
+outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+try
+  {
+writer.setPosition(recordCount);
--- End diff --

seems this is still doing indent of 4.  We use 2 spaces (see 
https://drill.apache.org/docs/apache-drill-contribution-guidelines/   scroll 
down to Step 2).   Did it pass the mvn command line build without checkstyle 
violations ? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67390934
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -116,6 +117,7 @@ public void testMixedNumberTypes() throws Exception {
   .jsonBaselineFile("jsoninput/mixed_number_types.json")
   .build().run();
 } catch (Exception ex) {
+  ex.printStackTrace();
--- End diff --

Not a good idea to print stack trace in unit tests. The output of our unit 
tests is already too verbose.
Use junit.fail with the message from the exception instead?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67390506
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -179,4 +181,28 @@ public void testNestedFilter() throws Exception {
 .sqlBaselineQuery(baselineQuery)
 .go();
   }
+
+
+ @Test // See DRILL-4653
+  public void testSkippingInvalidJSONRecords() throws Exception {
--- End diff --

For both these tests could you pls use the testBuilder() framework ?  This 
is the recommended way to write the unit tests .. you can see one of the other 
tests in this file.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389956
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -116,6 +117,7 @@ public void testMixedNumberTypes() throws Exception {
   .jsonBaselineFile("jsoninput/mixed_number_types.json")
   .build().run();
 } catch (Exception ex) {
+  ex.printStackTrace();
--- End diff --

not sure why this printStackTrace was added in a different test from the 
ones that you added...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389846
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +194,37 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
 //Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
+   // try
+   // {
+  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+  try{
+writer.setPosition(recordCount);
+write = jsonReader.write(writer);
+if(write == ReadState.WRITE_SUCCEED) {
 //  logger.debug("Wrote record.");
-  recordCount++;
-}else{
+  recordCount++;
+}else{
 //  logger.debug("Exiting.");
-  break outside;
-}
-
+  break outside;
+}
   }
-
-  jsonReader.ensureAtLeastOneField(writer);
-
+  catch(Exception ex)
+  {
+   ++parseErrorCount;
--- End diff --

the indentations seem to be off here as well as other places.. can you make 
sure the indentations match the rest of the code ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389726
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +194,37 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
 //Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
+   // try
+   // {
+  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+  try{
+writer.setPosition(recordCount);
+write = jsonReader.write(writer);
+if(write == ReadState.WRITE_SUCCEED) {
 //  logger.debug("Wrote record.");
-  recordCount++;
-}else{
+  recordCount++;
+}else{
 //  logger.debug("Exiting.");
-  break outside;
-}
-
+  break outside;
+}
   }
-
-  jsonReader.ensureAtLeastOneField(writer);
-
+  catch(Exception ex)
--- End diff --

minor style convention: can you put the catch() on the previous line to 
match the closing paren 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389362
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +194,37 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
 //Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
+   // try
--- End diff --

remove these commented lines


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389073
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -135,6 +135,9 @@
   BooleanValidator JSON_EXTENDED_TYPES = new 
BooleanValidator("store.json.extended_types", false);
   BooleanValidator JSON_WRITER_UGLIFY = new 
BooleanValidator("store.json.writer.uglify", false);
   BooleanValidator JSON_WRITER_SKIPNULLFIELDS = new 
BooleanValidator("store.json.writer.skip_null_fields", true);
+  String JSON_READER_SKIP_MALFORMED_RECORDS_FLAG = 
"store.json.reader.skip_malformed_records";
--- End diff --

Can you change this to 'skip_invalid_records' such that the name is 
somewhat consistent with the future similar option in DRILL-3764.  In the 
future the json option would likely be subsumed by the new global option. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-16 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67388118
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -39,6 +40,7 @@
 import org.apache.drill.exec.vector.complex.fn.JsonReader;
 import org.apache.drill.exec.vector.complex.impl.VectorContainerWriter;
 import org.apache.hadoop.fs.Path;
+import org.apache.parquet.Log;
--- End diff --

Not sure why the parquet.log is included in the json reader


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-15 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67267511
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -217,11 +228,11 @@ public int next() {
   updateRunningCount();
   return recordCount;
 
-} catch (final Exception e) {
-  handleAndRaise("Error parsing JSON", e);
-}
+   // } catch (final Exception e) {
+   //   handleAndRaise("Error parsing JSON", e);
+   // }
 // this is never reached
-return 0;
+//return 0;
--- End diff --

Uncomment this (best practice since function has a return type)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-15 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67267273
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -64,6 +64,8 @@
   private final boolean enableAllTextMode;
   private final boolean readNumbersAsDouble;
   private final boolean unionEnabled;
+  private int parseErrorCount;
--- End diff --

This should be long instead of int since parseErrorCount is cumulative..so 
in the worst case it could be as large as runningRecordCount.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #518: DRILL-4653.json - Malformed JSON should not stop th...

2016-06-10 Thread ssriniva123
GitHub user ssriniva123 opened a pull request:

https://github.com/apache/drill/pull/518

DRILL-4653.json - Malformed JSON should not stop the entire query from 
progressing

https://issues.apache.org/jira/browse/DRILL-4653

- The default is to stop processing as is today when JSON parser encounters 
an exception
- Setting store.json.reader.skip_malformed_records will ensure that query 
progresses after
skipping the bad records
- Added two unit tests
- Also did testing after deploying the new build: Both positive and 
negative tests were done.
- Negative test result:
org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
Error parsing JSON - Unexpected character ('{' (code 123)): was expecting comma 
to separate OBJECT entries

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssriniva123/drill master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #518


commit 4fc70faf0e5ad5d434b944d084bc0b0e90d41c39
Author: Subbu Srinivasan 
Date:   2016-06-10T22:58:49Z

Fixes for DRILL-4653.json




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---