GitHub user kunal0137 opened a pull request:

    https://github.com/apache/drill/pull/1130

    Drill 3878

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/magpierre/drill DRILL-3878

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1130
    
----
commit 844f34a16e75719535ff94c54d5337746ea18c20
Author: MPierre <magnus.pierre@...>
Date:   2015-11-05T14:42:06Z

    Initial commit
    
    XML support in Apache Drill

commit 592b3af06c2ff45198136577561f2ec1f7caaee0
Author: MPierre <magnus.pierre@...>
Date:   2015-11-05T21:21:42Z

    Fixed some minor outstanding bugs
    
    EasyRecordReader has a new field, userName, and I forgot to change
    jsonProcessor from private to protected.

commit 8fad811edab43d3499b41bb66cb419248d11208f
Author: MPierre <magnus.pierre@...>
Date:   2015-11-09T08:59:08Z

    Merge remote-tracking branch 'apache/master' into DRILL-3878

commit 38f4884fe9b8456c1cde5de44c1e54177301a974
Author: MPierre <magnus.pierre@...>
Date:   2016-03-16T11:33:15Z

    Syncing to latest release of drill

commit 909c5dec8bdb01bfe0ed358ebc64c959785738df
Author: MPierre <magnus.pierre@...>
Date:   2016-03-16T11:34:10Z

    syncing to latest release of drill

commit 597d9657d613fa35df2c10dff23681545b13e531
Author: MPierre <magnus.pierre@...>
Date:   2016-03-18T08:55:51Z

    Cleaned up deliver
    
    Cleaned up the output generated by the SAX Parser, and removed all
    unnecessary code.

commit 0cfaa31ab9af89833417288a290d21d0ce88c4ac
Author: MPierre <magnus.pierre@...>
Date:   2016-03-18T10:29:51Z

    Merge remote-tracking branch 'apache/master' into DRILL-3878

commit aaaff05eb921125ad64854c89c179292c4441fb7
Author: MPierre <magnus.pierre@...>
Date:   2016-03-24T13:05:53Z

    Adjusted output from Parser to fit Drill better
    
    I have adjusted the SAX parser to produce JSON that Drill likes. Among
    the corrections: empty objects are removed from the tree that is
    built, and repeating values are consolidated into arrays.
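
To illustrate the two adjustments described in this commit message, here is a minimal sketch, not the code from the pull request. It assumes the parsed tree is held as nested `Map<String, Object>` values; the class `TreeShaper` and the method `addChild` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the name and the map-based tree are assumptions.
class TreeShaper {
    /**
     * Adds a child under parent. Empty objects are skipped, and a name
     * that repeats is merged into a single list of values.
     */
    @SuppressWarnings("unchecked")
    static void addChild(Map<String, Object> parent, String name, Object value) {
        // Drop empty objects so they never reach the JSON output.
        if (value instanceof Map && ((Map<?, ?>) value).isEmpty()) {
            return;
        }
        Object existing = parent.get(name);
        if (existing == null) {
            // First occurrence: store the value as-is.
            parent.put(name, value);
        } else if (existing instanceof List) {
            // Third or later occurrence: append to the existing list.
            ((List<Object>) existing).add(value);
        } else {
            // Second occurrence: promote the scalar to a list.
            List<Object> merged = new ArrayList<>();
            merged.add(existing);
            merged.add(value);
            parent.put(name, merged);
        }
    }
}
```

Calling `addChild` twice with the same name leaves a single key holding a two-element list, and an empty map passed as the value is ignored.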

commit ba19a356d850224c01b9e807183377b46cf7e545
Author: MPierre <magnus.pierre@...>
Date:   2016-03-24T13:10:57Z

    Fixed small typo

commit 8ba6705be42c7847d469611ab070b869e0c76d8c
Author: MPierre <magnus.pierre@...>
Date:   2016-03-24T21:17:30Z

    Further enhancements of the output format to fit Drill

commit e2273f13b8e0136a33c1576c4667f16e23e1631c
Author: MPierre <magnus.pierre@...>
Date:   2016-03-24T21:22:41Z

    Removed comment

commit c1b6ff8375a7e3c8161167d1a5f2b34ba165e750
Author: MPierre <magnus.pierre@...>
Date:   2016-03-29T12:48:53Z

    Merge remote-tracking branch 'apache/master' into DRILL-3878

commit aacdec286781bc09dfc770044d4468ad7d83a6fc
Author: MPierre <magnus.pierre@...>
Date:   2016-03-29T18:24:04Z

    Corrected if style violations

commit 980be53b7192a8b09f5932eb31b3a70a17873300
Author: MPierre <magnus.pierre@...>
Date:   2016-04-22T11:53:49Z

    Addressing data volume to JSON Parser
    
    The SAX parser streams through the files it reads using an event push
    model. This filter runs as part of the SAX event handler and
    pre-qualifies events, i.e. it checks whether the data is relevant to
    the query and drops events that do not contribute to the query
    result. This leaves less volume to convert from XML to JSON and less
    volume to send to the JSON Record Reader, and makes it possible to
    get specific information from large documents without exhausting
    memory.
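
The pre-filtering idea can be sketched with the standard `org.xml.sax.helpers.XMLFilterImpl` class. This is an illustration under assumptions, not the filter from the pull request: the class name `QueryElementFilter`, the `wanted` set of element names, and the helper `forwardedElements` are all hypothetical, and relevance is decided by element name only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: forwards only events inside "wanted" elements.
class QueryElementFilter extends XMLFilterImpl {
    private final Set<String> wanted;
    private int depth = 0; // > 0 while inside a wanted element

    QueryElementFilter(Set<String> wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (depth > 0 || wanted.contains(qName)) {
            depth++;
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (depth > 0) {
            super.endElement(uri, localName, qName);
            depth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        if (depth > 0) {
            super.characters(ch, start, length);
        }
    }

    /** Parses xml and returns the element names that survive the filter. */
    static List<String> forwardedElements(String xml, Set<String> wanted) {
        List<String> seen = new ArrayList<>();
        try {
            QueryElementFilter filter = new QueryElementFilter(wanted);
            filter.setParent(
                SAXParserFactory.newInstance().newSAXParser().getXMLReader());
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    seen.add(qName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return seen;
    }
}
```

Because dropped events never reach the downstream handler, only the subtrees named in `wanted` would need to be converted to JSON. Document-level events such as `endDocument` are still passed through by `XMLFilterImpl`.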

commit 2fe0aaf80bd4a3e616bc3c9c8d46c26472232bb0
Author: MPierre <magnus.pierre@...>
Date:   2016-04-22T11:59:26Z

    Embedding JSONRecordReader instead of extending
    
    Based on feedback, I have embedded the JSON Record Reader instead of
    extending it. Since I need to call a constructor that was previously
    private, as well as methods that were previously private, I had to
    change them to public. I have restored the internal variables that
    had been changed to protected back to private. The XML Record Reader
    now calls the pre-filtering class to reduce the total volume handled
    by both the XML-to-JSON parser and the JSON Record Reader.
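
The difference between embedding and extending can be sketched as follows. Both classes are hypothetical stand-ins invented for this example; they are not Drill's JSONRecordReader or the XML reader from the pull request, and the JSON string building does no escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JSON reader. Its fields stay private and
// only the methods the embedding class needs are public.
class JsonLineReader {
    private final List<String> records = new ArrayList<>();

    public void accept(String json) {
        records.add(json);
    }

    public List<String> records() {
        return records;
    }
}

// Hypothetical XML reader that holds a JSON reader as a field (embedding)
// rather than declaring "extends JsonLineReader".
class XmlLineReader {
    private final JsonLineReader delegate = new JsonLineReader();

    /** Converts one tag/text pair to JSON and hands it to the delegate. */
    public void read(String tag, String text) {
        delegate.accept("{\"" + tag + "\":\"" + text + "\"}");
    }

    public List<String> records() {
        return delegate.records();
    }
}
```

With this shape the embedding class depends only on the delegate's public methods, which matches the trade-off described above: constructors and methods become public, while internal variables can stay private.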

commit 8f4ca71183ff3f688fb0b2460064227ac8ebeb7e
Author: MPierre <magnus.pierre@...>
Date:   2016-04-22T12:03:01Z

    Improve output format and added EndDocument method
    
    Added EndDocument to support the XML filter; without it the stream is
    not closed and the filtering appears to hang. Also renamed generated
    arrays from _array to :drill_array to distinguish them from tag names
    that might already exist in the document.

----

