There are some other tools, like csvkit and csvlint, mentioned here that might help: http://jexp.de/blog/2014/10/load-cvs-with-success/
Did you try the single escaping \" ?

For the time being, perhaps a simple sed script like sed -e "s/\"/''/g" might help? It replaces each double quote with two single quotes.

On Mon, Nov 3, 2014 at 12:32 AM, David Bigelow <[email protected]> wrote:

> Thanks for getting back so quick.
>
> Please don't assume everything is fully qualified regarding CSV files -
> that will cause many problems with the CSV importer in cypher for many
> customers.
>
> Legacy systems that we are getting data from do NOT export with qualifiers
> around text - often we get just a delimited field break character. Many
> times exports from legacy systems (think green screens) have "fixed" width
> fields and NO qualifiers for data type. So you may get a 20 character
> field with only 3 characters of text in it routinely. A good 'sed' script
> can fix that to squeeze the air out of the file on unix, BUT adding in text
> qualifiers BLOATS the file on larger data sets to get it to go through the
> neo4j csv importer (not to mention it can be complicated to do well).
>
> As a hack, we have used sqlite3-shell to create an in-memory sql database
> to import CSV files, and then clean them up and then re-export them to a
> more formatted CSV file - BUT, even that export has its own assumptions as
> to what a "value" is. -- sqlite will often write single word/number values
> WITHOUT double quotes and multi-word/number values as strings.
>
> It would be MUCH BETTER to have the cypher CSV importer consider the
> FIELDTERMINATOR as the separator between field values, and then
> optionally qualify if the value has a text qualifier or not (e.g.
> double-quotes around the value by default). Hence my request/suggestion
> for a *TEXTQUALIFIER true/false option.*
>
> This is a BIG issue. (huge, if you are trying to migrate legacy system
> data into neo4j and do daily updates while legacy systems are kept alive
> and feeding updates to neo4j).
>
> Dave
>
> Always open to suggestion on this, but I have pounded my head on this
> issue for the better part of a weekend trying to figure out what exactly is
> causing my problems....
>
> On Sunday, November 2, 2014 5:58:35 PM UTC-5, Michael Hunger wrote:
>>
>> Does using just a doubled double quote "" or an escape quote \" help?
>>
>> The common CSV idioms indicate that as soon as you have special stuff in
>> your text values like quotes or newlines and such, you _must_ quote them.
>>
>> Which tool created this unusual CSV format?
>>
>> Michael
>>
>> On Sun, Nov 2, 2014 at 10:52 PM, David Bigelow <davidh...@
>> simplifiedlogic.com> wrote:
>>
>>> This was a painful discovery.
>>>
>>> It appears that neo4j is EXPECTING Quoted content in a CSV file no
>>> matter what.
>>>
>>> If you send data that does NOT have Text Qualifiers as an option from
>>> another database source to generate your CSV File but DOES have Double
>>> Quotes within the content, neo4j will consider the double-quote as the
>>> start of a separated content REGARDLESS of your FIELDTERMINATOR setting.
>>>
>>> For example (5 lines of text to be added to a single node property)
>>>
>>> ABC1203|1|Length: 2" from left
>>> ABC1203|2|Textured
>>> ABC1203|3|Depth: 6" from angle
>>> ABC1203|4|Thickness: 8" from edge
>>> ABC1203|5|Paint: 0.02" all around
>>>
>>> Might look like the following after import:
>>> from left ABC1203|2|Textured ABC1203|3|Depth: 6 from edge
>>> ABC1203|5|Paint: 0.02" all around
>>>
>>>
>>> The problem appears to be that neo4j is expecting values to be contained
>>> in Quotes and Properly Escaped:
>>> ABC1203|1|"Length: 2\" from left"
>>> ABC1203|2|"Textured"
>>> ABC1203|3|"Depth: 6\" from angle"
>>> ABC1203|4|"Thickness: 8\" from edge"
>>> ABC1203|5|"Paint: 0.02\" all around"
>>>
>>> This will get you more like this:
>>> Length: 2" from left Textured Depth: 6" from angle
>>> Thickness: 8" from edge Paint: 0.02" all around
>>>
>>>
>>> This is a bit of a problem relative to importing CSV files... Not all
>>> systems will write data out the same way for CSV.
>>>
>>> For example, sqlite3-shell will write out the above data like this -
>>> notice how a single value is NOT Text Qualified, and it double escapes
>>> Double Quotes.
>>>
>>> ABC1203|1|"Length: 2"" from left"
>>> ABC1203|2|Textured
>>> ABC1203|3|"Depth: 6"" from angle"
>>> ABC1203|4|"Thickness: 8"" from edge"
>>> ABC1203|5|"Paint: 0.02"" all around"
>>>
>>> *Proposed Solution:*
>>> Include a *TEXTQUALIFIER* *true/false* option so that the default rule
>>> for assuming double quotes around values can be disabled more easily and
>>> also assume that content between the FIELDTERMINATORs is qualified by the
>>> FIELDTERMINATOR that is specified by the user in the header of the CSV
>>> import cypher.
>>>
>>> (note: this is a hacky example - I can come up with something more
>>> definitive if necessary - but I have been fighting this quite a bit).
>>>
>>> Dave
>>>
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
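
For anyone landing on this thread with the same problem, here is a rough sketch of the pre-processing workaround discussed above, written in Python rather than sed. It splits each line on the field terminator only, treats a bare " as ordinary data, trims fixed-width padding, and writes every value out fully quoted with embedded quotes doubled (the "" form Michael asks about). The function name requote and the choice to strip whitespace are my own assumptions, not anything from neo4j, and whether the importer accepts the doubled-quote form should be checked against your neo4j version.

```python
import csv
import io


def requote(lines, delimiter="|"):
    """Return fully quoted CSV text built from unqualified delimited lines.

    Hypothetical helper: assumes the delimiter never appears inside a value
    and that leading/trailing padding in a field is not significant.
    """
    out = io.StringIO()
    # QUOTE_ALL wraps every field in double quotes; the csv module doubles
    # any embedded double quote by default (doublequote=True).
    writer = csv.writer(
        out, delimiter=delimiter, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    for line in lines:
        # Split on the delimiter only, so a bare " stays part of the value.
        fields = [field.strip() for field in line.rstrip("\r\n").split(delimiter)]
        writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    sample = [
        'ABC1203|1|Length: 2" from left',
        "ABC1203|2|Textured",
    ]
    print(requote(sample), end="")
```

Quoting every field does make the file larger, which is the bloat Dave complains about, so this is a stopgap for getting a file through the importer rather than a replacement for the TEXTQUALIFIER option he proposes.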
