A note of caution when using this.
The chunking rules for Rev are that a line can contain items, but items
cannot contain lines.
Lines can contain items, items contain words.
wrong = line 6 of item 4, no matter the delimiter
You can get past this by adding parentheses:
item 1 of word 2
Hello folks,
While this message isn't specifically about a Rev solution per se, I think that
some of you may still find it helpful. After a lot of research on the net, I
finally found a one-two punch to help clean up this large mass of CSV data
that I've been working with. As always, I
Great work, David. Thanks for keeping us up to date on this.
Tom McGrath III
Lazy River Software
3mcgr...@comcast.net
iTunes Library Suite - libITS
Information and download can be found on this page:
http://www.lazyriversoftware.com/RevOne.html
On Dec 8, 2009, at 7:53 AM, David Coker wrote:
David,
I've just imported 30 MB worth of text into a SQLite database over a remote
connection. What I tried using first was a CSV file as well and I could not
make it work. I am importing magazine articles, the comma count is all wrong
and unless I wrote a super-duper RegEx thing to cope with
Andre,
When I first had headaches working with csv I ended up, in Excel prior to
exporting, doing a 'Replace All'
, to c
' to sqsqsq (single quote)
" to dqdqdq (double quote)
tab to t
carriage returns to crcrcr
the basic principle being replacing all problem characters with a character
I'm sure others have known about this but I feel like someone just
turned on a giant light bulb in my head. This is the single best thing
I have learned this year...
set the linedelimiter to comma
set the itemdelimiter to quote
I am just amazed...
Tom McGrath III
Lazy River Software
David,
Prior to exporting from Excel, do a Replace All 'tabs' with and
replace all line breaks with lblblblb (obviously you can use any number of
and combination of characters that you are confident will never appear in
the data). Then export it as tab delimited.
Every line of the resulting
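For anyone who wants to try the placeholder trick outside Excel, here is a minimal sketch in Python (not Rev), assuming the cells are already available as strings. The token `tbtbtb` and the helper names are made up for illustration; only `lblblblb` comes from the message above, and any string that never appears in the data would do.

```python
# Hypothetical placeholder tokens: pick strings that never occur in the data.
TAB_TOKEN = "tbtbtb"
LINE_BREAK_TOKEN = "lblblblb"


def encode_cell(text):
    """Replace tabs and line breaks inside one cell with placeholder tokens."""
    # Normalize Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\t", TAB_TOKEN).replace("\n", LINE_BREAK_TOKEN)


def decode_cell(text):
    """Restore the original tabs and line breaks after the import."""
    return text.replace(LINE_BREAK_TOKEN, "\n").replace(TAB_TOKEN, "\t")


def parse_tab_delimited(data):
    """Split encoded tab-delimited text into rows of decoded cells."""
    return [[decode_cell(cell) for cell in line.split("\t")]
            for line in data.split("\n") if line]


rows = [["1", "first line\nsecond line"], ["2", "left\tright"]]
exported = "\n".join("\t".join(encode_cell(c) for c in row) for row in rows)
assert parse_tab_delimited(exported) == rows
```

Once the cells are encoded, every record sits on exactly one line and every tab is a real field separator, so a plain split is enough on the import side.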
On Dec 5, 2009, at 5:24 AM, Thomas McGrath III wrote:
I'm sure others have known about this but I feel like someone just
turned on a giant light bulb in my head. This is the single best
thing I have learned this year...
set the linedelimiter to comma
set the itemdelimiter to quote
I am
A note of caution when using this.
The chunking rules for Rev are that a line can contain items, but
items cannot contain lines.
Lines can contain items, items contain words.
wrong = line 6 of item 4, no matter the delimiter
Not all functions obey a new setting.
Filter does not follow the
Hello folks,
I'm in the planning stages of a possible new app which will include populating
a Rev Database (SQLite) primarily from a standard Excel based CSV file. What
I've run into while doing some research is that the format seems to leave a lot
to be desired. It seems that the CSV data that
On 12/4/09 10:46 PM, David Coker davidoco...@gmail.com wrote:
Hi David,
To deal with line breaks, there is a tip:
specify some odd chars as the field and line delimiters, e.g.
µ
At least Valentina and MySQL allow this.
If SQLite cannot, you can try to use the above DBs as an intermediate step.
normally, csv is a pain in the ass, but Rev tokens trump that ;)
you can set the linedelimiter to comma, and the itemdelimiter to quote.
then you could do something similar to this (great for not having to
make special cases for the particular field being empty, lacking
quotes, etc.):
set
David Coker wrote:
set the linedelimiter to comma
set the itemdelimiter to quote
repeat for each line theLine in theCSV
put item 1 to -1 of theLine into theData
--do stuff with data here
end repeat
That is an awesome idea, well worth pursuing...
Be careful if there are embedded
Hi David,
To deal with line breaks, there is a tip:
specify some odd chars as the field and line delimiters, e.g.
„ µ
At least Valentina and MySQL allow this.
If SQLite cannot, you can try to use the above DBs as an intermediate step.
Hello Ruslan,
Thank you for your reply.
If I understand
Be careful if there are embedded commas inside each quoted section. It
will fail for cases like this:
one,two,three,four
Ouch!
The test files I've been working with are chock full of such.
The data comes from numerous sources, going back something like ten years,
having been produced by
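To make the embedded-comma warning concrete, here is a small sketch in Python (not Rev). The sample record is invented for illustration and is not taken from the thread. It compares a plain split on commas, which is roughly what the delimiter trick does, with Python's standard `csv` module, which is quote-aware.

```python
import csv
import io

# Invented sample record: the second field contains a comma inside quotes.
record = '"one","two, with comma","three"'

# Naive approach: every comma splits, including the one inside the quotes.
naive_fields = [chunk.strip('"') for chunk in record.split(",")]

# Quote-aware parsing keeps the embedded comma inside its field.
parsed_fields = next(csv.reader(io.StringIO(record)))

print(naive_fields)   # ['one', 'two', ' with comma', 'three']
print(parsed_fields)  # ['one', 'two, with comma', 'three']
```

The naive split returns four fields instead of three, which is exactly the failure described above.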
Depending on how big the files are and how quickly they need to be
processed, I would recommend you build a scanner that toggles at least
two flags.
Flag 1 is flgQuoteRunOn as true or false
Flag 2 is flgEscapeChar as true or false
put false into flgQuoteRunOn
put false into flgEscapeChar
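The two-flag scanner idea could look something like the following. This is only a sketch in Python (not Rev), since the rest of the original script is cut off here. It assumes a backslash as the escape character, which the message does not specify, and it reuses the flag names from the message; `scan_csv` is a made-up name.

```python
def scan_csv(text, delimiter=",", quote='"', escape="\\"):
    """Character-by-character CSV scanner driven by two flags."""
    rows, row, field = [], [], []
    flg_quote_run_on = False  # True while we are inside a quoted field
    flg_escape_char = False   # True when the previous char was the escape char
    for ch in text:
        if flg_escape_char:
            field.append(ch)          # escaped char is taken literally
            flg_escape_char = False
        elif ch == escape:
            flg_escape_char = True
        elif ch == quote:
            flg_quote_run_on = not flg_quote_run_on
        elif ch == delimiter and not flg_quote_run_on:
            row.append("".join(field))
            field = []
        elif ch == "\n" and not flg_quote_run_on:
            row.append("".join(field))
            field = []
            rows.append(row)
            row = []
        else:
            field.append(ch)          # includes commas and newlines in quotes
    if field or row:                  # last record without a trailing newline
        row.append("".join(field))
        rows.append(row)
    return rows


print(scan_csv('a,"b,c",d\n1,"two\nlines",3\n'))
# [['a', 'b,c', 'd'], ['1', 'two\nlines', '3']]
```

Note that this sketch does not preserve Excel-style doubled quotes (`""`) inside a quoted field; they just toggle the quote flag twice and vanish. Handling them would need a one-character lookahead.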
Jim Ault wrote:
Just so you know CSV is the second worst format ever invented.
They are still searching for the worst one, but have not found it yet.
ROTFL!
There was a huge discussion a couple of years ago on this list about the
best way to parse CSV. There weren't any perfect solutions, but
Hi Richard!
If you have it in Excel, can you export it using tab-delimited?
Yes sir, been there and tried that several different ways. Unfortunately, that
creates issues of a different sort. As a really poor example, this is one of
the things I continue to run across after converting to tabbed
David Coker wrote:
Hi Richard!
If you have it in Excel, can you export it using tab-delimited?
Yes sir, been there and tried that several different ways.
Unfortunately, that creates issues of a different sort. As a really
poor example, this is one of the things I continue to run across