Hi,

Thanks for your reply.
I would like to submit my changes, but I am denied permission when I try
to push to the Drill repo. How do I open a pull request in Git? Are there
any permissions I need to obtain before pushing to the repo?


On Wed, Dec 28, 2022 at 15:46, Charles Givre <cgi...@gmail.com> wrote:

> Hi Marc,
> Thanks for this.  Here's the thing... Let's say you have json that looks
> like this:
>
> {
>         "foo":null
> },{
>         "foo": 3.5
> }
>
> If you take the approach that `null` is treated like a string, you will
> get a schema change exception when you read the next row.  Our current
> approach is basically to ignore fields whose data type Drill cannot
> determine.  Once Drill encounters a typed value, it assigns that data
> type to the column.  See the example below, which is from DRILL-5033.  I
> added a second row to demonstrate what happens once Drill is able to
> determine a data type.  Note that for the columns with a defined value in
> the second row, Drill returns 'null' as the value in the first row.
>
>
> [{
> "intKey" : null,
> "bgintKey": null,
> "strKey": null,
> "boolKey": null,
> "fltKey": null,
> "dblKey": null,
> "timKey": null,
> "dtKey": null,
> "tmstmpKey": null,
> "intrvldyKey": null,
> "intrvlyrKey": null
> },
> {
> "intKey" : 1,
> "bgintKey": 3666565464,
> "strKey": "hithere",
> "boolKey": true,
> "fltKey": 3.5,
> "dblKey": 4.2,
> "timKey": null,
> "dtKey": null,
> "tmstmpKey": null,
> "intrvldyKey": null,
> "intrvlyrKey": null
> }]
>
>
> select * from dfs.test.`nulls.json`;
>
> +--------+---------------+---------+---------+--------+--------+--------+-------+-----------+-------------+-------------+
> | intKey |   bgintKey    | strKey  | boolKey | fltKey | dblKey | timKey | dtKey | tmstmpKey | intrvldyKey | intrvlyrKey |
> +--------+---------------+---------+---------+--------+--------+--------+-------+-----------+-------------+-------------+
> | null   | null          | null    | null    | null   | null   | []     | []    | []        | []          | []          |
> | 1.0    | 3.666565464E9 | hithere | true    | 3.5    | 4.2    | []     | []    | []        | []          | []          |
> +--------+---------------+---------+---------+--------+--------+--------+-------+-----------+-------------+-------------+
> 2 rows selected (0.232 seconds)
>
> You are definitely welcome to submit a pull request; however, this area is
> extremely complex, and I suspect that what you propose will break other
> unit tests.  Another option, which you might not be aware of, is providing
> a schema.  If you do that from the beginning, then Drill will know what
> data types to expect.
>
> Best,
> -- C
>
>
> > On Dec 28, 2022, at 8:57 AM, marc nicole <mk1853...@gmail.com> wrote:
> >
> > Hello Drillers :)
> >
> > I came across the aforementioned bug (DRILL-5033) and wanted to
> > contribute.
> > My attempt is to treat a *null* token as a *string* and print "null"
> > as the column value instead of omitting the key from the output
> > result set. Details of the attempted fix are below:
> >
> >
> > *1)* In JsonReader.java (java-exec/drill-exec/vector/complex/fn/) at line
> > 283 I add the following:
> >
> >> ...
> >> case VALUE_NULL:
> >>          // handle null as string
> >>          handleString(parser, map, fieldName);
> >>          break;
> >> ...
> >
> >
> > *2)* then at line 415, handleString() becomes:
> >
> >> private void handleString(JsonParser parser, MapWriter writer, String fieldName) throws IOException {
> >>   try {
> >>     // added the following if
> >>     if (parser.nextToken() == VALUE_NULL) {
> >>       writer.varChar(fieldName)
> >>           .writeVarChar(0, workingBuffer.prepareVarCharHolder("null"), workingBuffer.getBuf());
> >>     } else {
> >>       writer.varChar(fieldName)
> >>           .writeVarChar(0, workingBuffer.prepareVarCharHolder(parser.getText()), workingBuffer.getBuf());
> >>     }
> >>   } catch (IllegalArgumentException e) {
> >>     if (parser.getText() == null || parser.getText().isEmpty()) {
> >>       // return;
> >>     }
> >>     throw e;
> >>   }
> >> }
> >
> >
> >
> > Is this a possible fix for the mentioned bug?
> > If so, should I open a pull request?
> >
> > Thanks.
>
>

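To make the type-deferral point from Charles's reply concrete, here is a toy model in plain Java. It is not Drill's code: the names NullTypeSketch, ColumnTypeTracker, ColType, observe and run are all made up for illustration, and an IllegalStateException stands in for Drill's schema change exception. It only sketches why committing to a string type on a null clashes with a later 3.5, whereas leaving the type undecided does not.

```java
import java.util.HashMap;
import java.util.Map;

public class NullTypeSketch {

    /** Hypothetical column types; Drill's real type system is much richer. */
    enum ColType { UNDECIDED, VARCHAR, FLOAT8 }

    /** Tracks one type per column and rejects a conflicting later type. */
    static class ColumnTypeTracker {
        private final Map<String, ColType> types = new HashMap<>();
        private final boolean nullAsString;

        ColumnTypeTracker(boolean nullAsString) {
            this.nullAsString = nullAsString;
        }

        /** Records the type implied by a value; throws if it conflicts. */
        void observe(String column, Object value) {
            ColType seen;
            if (value == null) {
                // The policy under discussion: commit to VARCHAR or stay undecided.
                seen = nullAsString ? ColType.VARCHAR : ColType.UNDECIDED;
            } else if (value instanceof Number) {
                seen = ColType.FLOAT8;
            } else {
                seen = ColType.VARCHAR;
            }
            ColType current = types.getOrDefault(column, ColType.UNDECIDED);
            if (current == ColType.UNDECIDED) {
                types.put(column, seen);
            } else if (seen != ColType.UNDECIDED && seen != current) {
                // Stand-in for a schema change exception.
                throw new IllegalStateException(
                    "schema change on " + column + ": " + current + " -> " + seen);
            }
        }

        ColType typeOf(String column) {
            return types.getOrDefault(column, ColType.UNDECIDED);
        }
    }

    /** Feeds {"foo": null} then {"foo": 3.5}; returns the final type or the error text. */
    static String run(boolean nullAsString) {
        ColumnTypeTracker tracker = new ColumnTypeTracker(nullAsString);
        try {
            tracker.observe("foo", null);
            tracker.observe("foo", 3.5);
            return tracker.typeOf("foo").toString();
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("deferred:       " + run(false));
        System.out.println("null as string: " + run(true));
    }
}
```

With the deferred policy the column ends up numeric once 3.5 arrives; with the null-as-string policy the second row triggers the conflict. Providing a schema up front, as suggested in the reply, would correspond to seeding the map before the first row is read.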