Hi all,

So it turns out that the problem was twofold. The null option was part of
the fix, but there was another problem. I accidentally had two duplicate
primary keys in my CSV file. Once changing that and adding the null option
the issue resolved and all entries were loaded into the database. Thank you
all so much for your help and your time.

On Thu, May 12, 2022 at 2:27 PM Akil Sevim <[email protected]> wrote:

> Hi Nicholas,
>
> To be on the safe side, can you please use the load query below (Bold text
> shows my additions). And again, to make sure everything is correct, please
> drop and create again the dataset before running this load.
>
> load dataset csv_set using localfs
>       (("path"="127.0.0.1:///Users/nicholassantini/Downloads/test 2.csv"),
> *("input-format"="text-input-format")*,("format"="delimited-text"),
> *("delimiter"=",")*);
>
> On Thu, May 12, 2022 at 2:03 PM Nicholas Santini <[email protected]> wrote:
>
>> Thank you! Unfortunately with the internal dataset the null parameter
>> doesn't change anything and with the external dataset it only adds a
>> warning that it cannot parse the null value:
>>
>> WARNING: Code: 1 "HYR0124: Parsing error at 
>> /Users/nicholassantini/Downloads/test 2.csv line 2 field 3: empty value"
>>
>>
>> On Thu, May 12, 2022 at 11:33 AM Ali Alsuliman <[email protected]>
>> wrote:
>>
>>> Nicholas,
>>>
>>> If you are using Mike's DDL, here is what you need to do:
>>> CREATE EXTERNAL DATASET csv_set(csv_type) USING localfs
>>> (("path"="127.0.0.1:///Users/nicholassantini/Downloads/test 2.csv"),
>>> ("format"="delimited-text"), ("delimiter"=","), ("null"= "") );
>>>
>>> If you are using your load statement, try this and see if it works:
>>> LOAD DATASET csv_set USING localfs
>>> (("path"="127.0.0.1:///Users/nicholassantini/Downloads/test 2.csv"),
>>> ("format"="delimited-text"), ("null"= "") );
>>>
>>> Thanks
>>> Ali
>>>
>>> On Thu, May 12, 2022 at 12:11 AM Nicholas Santini <[email protected]>
>>> wrote:
>>>
>>>> As I'm still learning SQL I'm not sure where to specify the ("null":
>>>> "") option. I didn't want to ask here since it seems to be a very
>>>> rudimentary question but I have yet to figure it out.
>>>>
>>>> On Wed, May 11, 2022 at 8:25 AM Michael Carey <[email protected]>
>>>> wrote:
>>>>
>>>>> @Nicholas:  Did this solve it, hopefully?
>>>>> On 5/9/22 7:10 PM, Ali Alsuliman wrote:
>>>>>
>>>>> Nicholas,
>>>>> The parser does not know how to handle (or rather what value to
>>>>> produce for) empty values when the field is numeric as in your case
>>>>> (int32). That's why it's complaining. The parser cannot produce 0 for 
>>>>> empty
>>>>> values, for example.
>>>>>
>>>>> Try specifying the option: ("null": ""). This option tells the parser
>>>>> to produce NULL for empty values.
>>>>>
>>>>> Thanks
>>>>> Ali
>>>>>
>>>>> On Sat, May 7, 2022 at 10:38 AM Nicholas Santini <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thank you so much for providing that example code. I am really
>>>>>> learning to swim in the deep end here as I have no experience with SQL.
>>>>>> Unfortunately, when I run your snippet I get the following error
>>>>>>
>>>>>> WARNING: Code: 1 "HYR0124: Parsing error at 
>>>>>> /Users/nicholassantini/Downloads/test 2.csv line 2 field 3: empty value"
>>>>>>
>>>>>>
>>>>>> On Thu, May 5, 2022 at 7:40 PM Mike Carey <[email protected]> wrote:
>>>>>>
>>>>>>> PS - Here is a snippet from my approach to grading (I use SQL++)
>>>>>>> based on SQL++'ing against CSV data:
>>>>>>>
>>>>>>> CREATE EXTERNAL DATASET Grades(GradeRecord) USING localfs
>>>>>>>
>>>>>>> (("path"="localhost:///Users/mikejcarey/Desktop/Teaching/CS122a/CS122AStudentData2021.csv"),
>>>>>>>        ("format"="delimited-text"), ("delimiter"=","));
>>>>>>>
>>>>>>> For you I guess this would be:
>>>>>>>
>>>>>>> CREATE EXTERNAL DATASET csv_set(csv_type) USING localfs
>>>>>>> (("path"="127.0.0.1:///Users/nicholassantini/Downloads/test 2.csv"),
>>>>>>>        ("format"="delimited-text"), ("delimiter"=","));
>>>>>>>
>>>>>>> Try that?
>>>>>>>
>>>>>>>
>>>>>>> On 5/5/22 2:11 PM, Nicholas Santini wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Im using the following code to load a csv file into my database,
>>>>>>> However, any time there is an empty field it seems to skip the entire 
>>>>>>> row.
>>>>>>> For example test,1,3 works but test1,1,  doesn't create an entry. Does
>>>>>>> anyone know why it does this or how to fix it?
>>>>>>>
>>>>>>>
>>>>>>> drop dataverse csv if exists;
>>>>>>> create dataverse csv;
>>>>>>> use csv;
>>>>>>>
>>>>>>> create type csv_type as {
>>>>>>>     Cross_Street: string,
>>>>>>>     lat: int32,
>>>>>>>     long: int32?
>>>>>>> };
>>>>>>>
>>>>>>> create dataset csv_set (csv_type)
>>>>>>>   primary key lat;
>>>>>>>
>>>>>>>
>>>>>>>   load dataset csv_set using localfs
>>>>>>>       (("path"="127.0.0.1:///Users/nicholassantini/Downloads/test
>>>>>>> 2.csv"),
>>>>>>>        ("format"="delimited-text"));
>>>>>>>
>>>>>>>
>>>>>>>  SELECT *
>>>>>>>      FROM csv_set csv_type
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>>
>>>
>>> --
>>> Regards,
>>>
>>

Reply via email to