Other tools such as csvkit and CSVLint, mentioned here, might also help:
http://jexp.de/blog/2014/10/load-cvs-with-success/

Did you try escaping with a single backslash, \"?

For the time being, perhaps a simple sed script like
sed -e "s/\"/''/g"
might help? It replaces each double quote with two single quotes.
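
If replacing the quotes loses too much information, another option might be to add the text qualifiers in a small preprocessing step. Below is a minimal sketch in Python, assuming the input is pipe-delimited with no quoting at all (as in the example quoted further down) and that no field contains the delimiter itself. The function name `quote_fields` is made up for illustration, and whether the importer accepts doubled quotes is something to verify against your Neo4j version.

```python
import csv
import io


def quote_fields(raw_text, delimiter="|"):
    """Re-emit unquoted delimited text with every field wrapped in quotes."""
    out = io.StringIO()
    writer = csv.writer(
        out,
        delimiter=delimiter,
        quoting=csv.QUOTE_ALL,  # wrap every field in double quotes
        doublequote=True,       # an embedded " is written as ""
        lineterminator="\n",
    )
    for line in raw_text.splitlines():
        # Split on the delimiter only, so a bare " stays ordinary text.
        writer.writerow(line.split(delimiter))
    return out.getvalue()


sample = 'ABC1203|1|Length: 2" from left\nABC1203|2|Textured\n'
print(quote_fields(sample), end="")
```

If the importer turns out to want \" rather than "", passing doublequote=False and escapechar="\\" to csv.writer should produce that form instead.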



On Mon, Nov 3, 2014 at 12:32 AM, David Bigelow <
[email protected]> wrote:

> Thanks for getting back so quick.
>
> Please don't assume everything is fully qualified regarding CSV files -
> that will cause many problems with the CSV importer in cypher for many
> customers.
>
> Legacy systems that we are getting data from do NOT export with qualifiers
> around text - often we get just a delimited field break character.  Many
> times exports from legacy systems (think green screens) have "fixed" width
> fields and NO qualifiers for data type.  So you may get a 20 character
> field with only 3 characters of text in it routinely.  A good 'sed' script
> can fix that to squeeze the air out of the file on unix, BUT adding in text
> qualifiers BLOATS the file on larger data sets to get it to go through the
> neo4j csv importer (not to mention it can be complicated to do well).
>
> As a hack, we have used sqlite3-shell to create an in-memory sql database
> to import CSV files, and then clean them up and then re-export them to a
> more formatted CSV file - BUT, even that export has its own assumptions as
> to what a "value" is. -- sqlite will often write single-word/number values
> WITHOUT double quotes and multi-word/number values as quoted strings.
>
> It would be MUCH BETTER to have the Cypher CSV importer consider the
> FIELDTERMINATOR as the separator between field values, and then
> optionally specify whether values have a text qualifier or not (e.g.
> double-quotes around the value by default).  Hence my request/suggestion
> for a *TEXTQUALIFIER true/false option.*
>
> This is a BIG issue. (huge, if you are trying to migrate legacy system
> data into neo4j and do daily updates while legacy systems are kept alive
> and feeding updates to neo4j).
>
> Dave
>
> Always open to suggestion on this, but I have pounded my head on this
> issue for the better part of a weekend trying to figure out what exactly is
> causing my problems....
>
>
>
>
> On Sunday, November 2, 2014 5:58:35 PM UTC-5, Michael Hunger wrote:
>>
>> Does using just a doubled double quote "" or an escape quote \" help?
>>
>> The common CSV idioms indicate that as soon as you have special stuff in
>> your text values like quotes or newlines and such, you _must_ quote them.
>>
>> Which tool created this unusual CSV format?
>>
>> Michael
>>
>> On Sun, Nov 2, 2014 at 10:52 PM, David Bigelow <davidh...@
>> simplifiedlogic.com> wrote:
>>
>>> This was a painful discovery.
>>>
>>> It appears that neo4j is EXPECTING Quoted content in a CSV file no
>>> matter what.
>>>
>>> If you send data that does NOT have Text Qualifiers as an option from
>>> another database source to generate your CSV File but DOES have Double
>>> Quotes within the content, neo4j will consider the double-quote as the
>>> start of a separated content REGARDLESS of your FIELDTERMINATOR setting.
>>>
>>> For example (5 lines of text to be added to a single node property):
>>>
>>> ABC1203|1|Length: 2" from left
>>> ABC1203|2|Textured
>>> ABC1203|3|Depth: 6" from angle
>>> ABC1203|4|Thickness: 8" from edge
>>> ABC1203|5|Paint: 0.02" all around
>>>
>>> Might look like the following after import:
>>>  from left ABC1203|2|Textured ABC1203|3|Depth: 6  from edge
>>> ABC1203|5|Paint: 0.02" all around
>>>
>>>
>>> The problem appears to be that neo4j is expecting values to be contained
>>> in Quotes and Properly Escaped:
>>> ABC1203|1|"Length: 2\" from left"
>>> ABC1203|2|"Textured"
>>> ABC1203|3|"Depth: 6\" from angle"
>>> ABC1203|4|"Thickness: 8\" from edge"
>>> ABC1203|5|"Paint: 0.02\" all around"
>>>
>>> This will get you more like this:
>>> Length: 2" from left Textured Depth: 6" from angle
>>> Thickness: 8" from edge Paint: 0.02" all around
>>>
>>>
>>> This is a bit of a problem relative to importing CSV files...  Not all
>>> systems will write data out the same way for CSV.
>>>
>>> For example, sqlite3-shell will write out the above data like this -
>>> notice how a single value is NOT Text Qualified, and it double escapes
>>> Double Quotes.
>>>
>>> ABC1203|1|"Length: 2"" from left"
>>> ABC1203|2|Textured
>>> ABC1203|3|"Depth: 6"" from angle"
>>> ABC1203|4|"Thickness: 8"" from edge"
>>> ABC1203|5|"Paint: 0.02"" all around"
>>>
>>> *Proposed Solution:*
>>> Include a *TEXTQUALIFIER* *true/false* option so that the default rule
>>> for assuming double quotes around values can be disabled more easily, and
>>> so that content between field breaks is treated as delimited only by the
>>> FIELDTERMINATOR that the user specifies in the header of the CSV
>>> import Cypher.
>>>
>>> (note: this is a hacky example - I can come up with something more
>>> definitive if necessary - but I have been fighting this quite a bit).
>>>
>>> Dave
>>>
>>>
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
