Re: Comma-delimited values

2010-03-10 Thread Gregory Lypny
Thanks everyone for your thoughtful comments on the comma-delimiter issue.  I 
have a couple of solutions in the works that are in the spirit of some of what 
has been posted.  For me the presence of quotation marks around some chunks or 
even within them (a literal quotation) adds to the challenge.

Regards,

Gregory
___
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution


Re: Comma-delimited values

2010-03-09 Thread Peter Haworth
I have to deal with csv files from several different external sources  
where csv is the only option and I have no control over the options  
used to create them.


I've run into most of the problems mentioned in this thread but they  
didn't seem that hard to solve.  I use the "inquotes" flag method  
mentioned earlier in this thread and go through character by character  
to parse the data into separate fields.  I'll admit that the files I  
have to deal with are not huge so I don't have to worry about  
performance issues but so far I haven't run into any problems with  
that approach (let's hope it stays that way!)
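
A minimal sketch of that character-by-character approach, for anyone who wants a starting point. The handler name csvToTabs and its variables are made up for illustration; it assumes quotes inside a quoted field are doubled (""), that the data contains no tabs, and it leaves any returns inside quoted fields untouched.

```livecode
-- Hypothetical helper, not posted in this thread: converts CSV text to
-- tab-delimited text by walking it one character at a time.
-- Assumes in-data quotes are doubled ("") and the data has no tabs.
function csvToTabs pCSV
   local tResult, tInQuotes, tJustClosed, tChar
   put empty into tResult
   put false into tInQuotes
   put false into tJustClosed
   repeat for each char tChar in pCSV
      if tChar is quote then
         if tInQuotes then
            -- either the end of the field or the first half of a doubled quote
            put false into tInQuotes
            put true into tJustClosed
         else
            -- a quote right after a closing quote is a literal quote
            if tJustClosed then put quote after tResult
            put true into tInQuotes
            put false into tJustClosed
         end if
         next repeat
      end if
      put false into tJustClosed
      if tChar is comma and not tInQuotes then
         put tab after tResult   -- a real field separator
      else
         put tChar after tResult -- ordinary data, including quoted commas
      end if
   end repeat
   return tResult
end csvToTabs
```

The enclosing quotes are dropped from the output, so a quoted date such as the one in Gregory's sample comes out as a single tab-delimited item with its commas intact.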


Nevertheless, I'd be happy to drink a few beers at the funeral!!

Pete Haworth



Re: Comma-delimited values

2010-03-09 Thread Martin Baxter
And drink many beers, and dance on the coffin.

Martin Baxter

stephen barncard wrote:
> We could have a funeral like IE6.
> -
> Stephen Barncard
> San Francisco
> 
> 
> On 8 March 2010 13:53, J. Landman Gay  wrote:
> 
>> Richard Gaskin wrote:
>>
>>  CSV must die.
>> Oh come on, Richard, tell us what you really think. :)
>>
>> --
>> Jacqueline Landman Gay | jac...@hyperactivesw.com
>> HyperActive Software   | http://www.hyperactivesw.com


Re: Comma-delimited values

2010-03-08 Thread Shao Sean
I wrote a CSV import library a while back, let me dig it out and post  
it.. I ran it against some test CSV files and it seems to handle all  
the weird little quirks in CSV (double quotes, newlines, etc)..



RE: Comma-delimited values

2010-03-08 Thread Paul D. DeRocco
> From: wayne durden
>
> I hear your pain on the CSV issue.  However, wishing it would die I think
> deserves a more careful reflection.  Sometimes the devil you know is worse
> than the alternative...

It's only difficult to deal with CSV if you want to make a totally general
importer, with no foreknowledge of the type of data the files will contain.
But in most real-life situations, you'll never need to deal with a double
quote or newline in the data, so using newlines to separate records, and
double quotes to escape commas within records, is a perfectly workable
definition of the CSV format. It would never occur to anyone to use CSV to
represent records containing binary data, so it isn't much more of a
restriction to rule out control characters and double quotes. Your
application is a typical example; so is using CSV as an interchange format
for GPS logs.

--

Ciao,   Paul D. DeRocco
Paul    mailto:pdero...@ix.netcom.com



Re: Comma-delimited values

2010-03-08 Thread stephen barncard
We could have a funeral like IE6.
-
Stephen Barncard
San Francisco


On 8 March 2010 13:53, J. Landman Gay  wrote:

> Richard Gaskin wrote:
>
>  CSV must die.
>>
>
> Oh come on, Richard, tell us what you really think. :)
>
> --
> Jacqueline Landman Gay | jac...@hyperactivesw.com
> HyperActive Software   | http://www.hyperactivesw.com
>


Re: Comma-delimited values

2010-03-08 Thread J. Landman Gay

Richard Gaskin wrote:


CSV must die.


Oh come on, Richard, tell us what you really think. :)

--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software   | http://www.hyperactivesw.com


Re: Comma-delimited values

2010-03-08 Thread Peter Brigham MD
If you are stuck with trying to import from CSV format, here's a  
function that will convert any commas within quotes to another  
delimiter character.


function fixQuotedItems tText
   -- first make sure that you choose an escape char
   -- not contained in your data
   repeat for each item escapeChar in "§,•,ª,∞,™,º"
      -- or whatever list of odd characters you want
      if escapeChar is in tText then next repeat
      exit repeat
   end repeat
   put empty into adjText
   set the itemdelimiter to quote
   repeat for each line tLine in tText
      put 0 into counter
      put empty into adjustedLine
      repeat for each item i in tLine
         add 1 to counter
         if counter mod 2 = 1 then
            put i after adjustedLine
         else
            put i into temp
            replace comma with escapeChar in temp
            put temp after adjustedLine
         end if
      end repeat
      if char -1 of adjustedLine = comma then delete char -1 \
            of adjustedLine
      put adjustedLine & cr after adjText
   end repeat
   if char -1 of adjText = cr then delete char -1 of adjText
   return adjText
end fixQuotedItems

this will take something like:

a,b,c,"1,2,3,4",d,e
"11,12,13",g,h,i
j,k,"22,23"

and return:

a,b,c,1§2§3§4,d,e
11§12§13,g,h,i
j,k,22§23

You can then parse your data and replace the escapeChar with comma  
after you're done.
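
A sketch of that last step, assuming the data contained no "§" so that fixQuotedItems picked it as the escape character (the function does not report which character it chose, so the caller has to know). The handler name csvToTabTable is made up for illustration.

```livecode
-- Hypothetical wrapper around fixQuotedItems: returns a tab-delimited table.
-- Assumes "§" was the escape character chosen by fixQuotedItems.
function csvToTabTable pCSV
   local tFixed, tTable, tRow, tLine, tItem, tField
   put fixQuotedItems(pCSV) into tFixed
   put empty into tTable
   set the itemDelimiter to comma
   repeat for each line tLine in tFixed
      put empty into tRow
      repeat for each item tItem in tLine
         -- work on a copy so the loop variable is left alone
         put tItem into tField
         replace "§" with comma in tField
         put tField & tab after tRow
      end repeat
      delete char -1 of tRow        -- drop the trailing tab
      put tRow & return after tTable
   end repeat
   delete char -1 of tTable         -- drop the trailing return
   return tTable
end csvToTabTable
```

With the three sample lines above, the first row comes back with "1,2,3,4" as a single tab-delimited item.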


-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig



On Mar 8, 2010, at 2:55 PM, Gregory Lypny wrote:

Thank you, Richmond.  Good stuff.  That would work here where only  
the first to-be item of each line is quoted because it has commas  
within it.  But if other files have quoted items in other locations  
(e.g., the fifth and ninth items), it would mean first identifying  
which chunks are quoted before I start converting from commas to tabs.


Gregory


On Mon, Mar 8 , 2010, at 1:00 PM, Richmond Mathewson wrote:



On 08/03/2010 19:44, Gregory Lypny wrote:

Hello everyone,

I'm creating an app that imports comma-delimited tables.  A few  
lines might look like this, where there are 14 items per line.


"Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
"Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10


My problem is that Rev treats each date in quotes as three items  
rather than one.  I convert the comma delimiters to tab by setting  
the itemDelimiter to comma and then running the lines through  
nested repeat-for-each loops as


repeat for each line thisLine in dataTable
repeat for each item thisItem in thisLine
put thisItem&  tab after newLine
end repeat
-- more stuff here
end repeat

I end up with

"Mon   (as the first item)
Jan 18  (as the second)
2010"  (as the third)

Any suggestions as how I might get the date treated as one item?



Yes, although it is so goofily obvious you have probably thought about
this one and rejected it:

Change the commas for the bits inside the quotes to something else
( ^ * %) - dunno, any old thing that isn't a comma . . . :)





Re: Comma-delimited values

2010-03-08 Thread wayne durden
Hi David, Richard et al.

I hear your pain on the CSV issue.  However, wishing it would die I think
deserves a more careful reflection.  Sometimes the devil you know is worse
than the alternative...

I use RunRev almost exclusively for parsing and processing datasets for
active traders.  CSV is one of the few remaining source formats that can be
easily resolved.  I use the flag method to handle quotes in the dataset...

The other alternatives my clients can provide are far more problematic ([XML
parsing - in theory good, in practice on large datasets there is almost
always some breakage in the dataset or the time to handle a large dataset is
exponentially greater than plain CSV ] [Excel - now virtually impossible to
deal with without a manual intervention and conversion])

So yes, CSV has its problems.  Pipe delimited would suit me better.  But a
simple line based ordered dataset has proven in practice to be much more
usable and quicker to handle than fancier solutions...

Don't wish a CSV death on me please :)

Wayne


On Mon, Mar 8, 2010 at 2:51 PM, David Coker  wrote:

> On Mon, Mar 8, 2010 at 12:55 PM, Richard Gaskin
>  wrote:
> 
> CSV must die.
>
> Please help it die:  never write CSV exporters.
> 
>
> After my last foray into the world of CSV and the *many* problems I
> ran into, I decided to take Richard's little rant a step further...
>
> ...Not only will I not ever write a CSV exporter, but I will no longer
> attempt to support CSV, even as an import option for anything I build.
>
> That policy may make a few folks unhappy and/or possibly cost me a few
> sales should I go that route, but with certainty, it will not
> perpetuate the problem from either end in the future.
>
> I think I'd prefer the competition (if any), to have all of those
> support issues they want. ;-)
>
> Regards,
> David C.


Re: Comma-delimited values

2010-03-08 Thread Gregory Lypny
Thank you, Richmond.  Good stuff.  That would work here where only the first 
to-be item of each line is quoted because it has commas within it.  But if 
other files have quoted items in other locations (e.g., the fifth and ninth 
items), it would mean first identifying which chunks are quoted before I start 
converting from commas to tabs.

Gregory


On Mon, Mar 8 , 2010, at 1:00 PM, Richmond Mathewson wrote:

> 
> On 08/03/2010 19:44, Gregory Lypny wrote:
>> Hello everyone,
>> 
>> I'm creating an app that imports comma-delimited tables.  A few lines might 
>> look like this, where there are 14 items per line.
>> 
>> "Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
>> "Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10
>> 
>> My problem is that Rev treats each date in quotes as three items rather than 
>> one.  I convert the comma delimiters to tab by setting the itemDelimiter to 
>> comma and then running the lines through nested repeat-for-each loops as
>> 
>> repeat for each line thisLine in dataTable
>> repeat for each item thisItem in thisLine
>> put thisItem&  tab after newLine
>> end repeat
>> -- more stuff here
>> end repeat
>> 
>> I end up with
>> 
>> "Mon (as the first item)
>> Jan 18   (as the second)
>> 2010"(as the third)
>> 
>> Any suggestions as how I might get the date treated as one item?
>> 
>> 
>> 
> Yes, although it is so goofily obvious you have probably thought about 
> this one and rejected it:
> 
> Change the commas for the bits inside the quotes to something else ( ^  
> *  %) - dunno, any old
> thing that isn't a comma . . . :)
> 



Re: Comma-delimited values

2010-03-08 Thread David Coker
On Mon, Mar 8, 2010 at 12:55 PM, Richard Gaskin
 wrote:

CSV must die.

Please help it die:  never write CSV exporters.


After my last foray into the world of CSV and the *many* problems I
ran into, I decided to take Richard's little rant a step further...

...Not only will I not ever write a CSV exporter, but I will no longer
attempt to support CSV, even as an import option for anything I build.

That policy may make a few folks unhappy and/or possibly cost me a few
sales should I go that route, but with certainty, it will not
perpetuate the problem from either end in the future.

I think I'd prefer the competition (if any), to have all of those
support issues they want. ;-)

Regards,
David C.


Re: Comma-delimited values

2010-03-08 Thread Bob Sneidar
Not to add fuel to the fire, but I have seen spreadsheet data containing tabs 
in cells before. Usually it's when importing from another program. Also, let's 
not forget that sometimes a cell can contain carriage returns too. 

This is the same problem that plagues SQL programmers, forcing them to escape 
their values before submitting a query. As it is the responsibility of SQL 
developers to take this precaution, it is also the responsibility of developers 
to properly format their exported data so that things like quotes, commas and 
carriage returns (delimiters) are "escaped" or converted to something else. 

Now Software used to do this pretty well, converting their notes fields 
containing commas and carriage returns to other special characters before 
exporting them. But eventually, you will hit this problem where the data is 
going to contain things you didn't expect, and you are going to have to deal 
with it, as in the case of the space after the day number in the original 
example. 

Bob


On Mar 8, 2010, at 11:19 AM, Jim Bufalini wrote:

> I agree with Richard's rant. CSV (Comma Separated Values) is a very, very
> old convention that was not well thought out from the beginning. Commas can
> exist within cells (values) as in the case of dates or text blocks or even
> formatted numbers ($1,000.00). So, to get around this, they added quotes
> around cells that could contain commas. The only problem with this, as
> Richard points out, is text blocks can also contain quotes. ;-)
> 
> This is why I said, whether you are exporting to import into Rev or any
> other program, use tab as the value separator and get rid of the arbitrary
> quotes that they only put in there because commas can exist in the cell.
> 
> Tab is very safe to use as a value separator from a spreadsheet export
> because if you press tab when editing a cell, in all spreadsheets I am aware
> of, the tab is not inserted into the cell, but instead jumps you to editing
> the next cell.
> 
> Aloha from Hawaii,
> 
> Jim Bufalini
> 
>> -Original Message-
>> From: use-revolution-boun...@lists.runrev.com [mailto:use-revolution-
>> boun...@lists.runrev.com] On Behalf Of Richard Gaskin
>> Sent: Monday, March 08, 2010 8:55 AM
>> To: How to use Revolution
>> Subject: Re: Comma-delimited values
>> 
>> Paul D. DeRocco wrote:
>>> Add an inQuotes flag, with an initial value of false. For each item,
>> if it
>>> has a quote in it, toggle the inQuotes flag. Then, if inQuotes is
>> set,
>>> append a comma instead of a tab, to put the item back together again.
>> 
>> Roger that.  CSV elements may contain returns within quoted portions,
>> and escaping uses a wide range of conventions differing from program to
>> program (within the Microsoft family I've seen it differ from version
>> to
>> version of the same program).  The flag method, while notoriously slow,
>> is the only reliable method I've found for handling all the many
>> variants of CSV.
>> 
>> 
>> A plea for sanity in the software development world:
>> 
>> While we need to write CSV importers from time to time, please never
>> write CSV exporters.
>> 
>> CSV must die.
>> 
>> The problem with CSV is that the comma is very commonly used in data,
>> making it a uniquely stupid choice as a delimiter.
>> 
>> That stupidity could be countered with consistent escaping, but no true
>> standard has emerged since the many decades of this productivity-abuse
>> began.
>> 
>> Making a stupid decision even more stupid, most CSV files quote
>> non-numeric values, as though the programmers did not realize the quote
>> character is commonly used in our language and therefore may likely be
>> part of the data. So using quote as an escape means that you must
>> escape
>> the escape.  Jeeminy who thinks up that merde?!?!
>> 
>> Sometimes the escape for in-data double-quotes is a double double-
>> quote,
>> which sometimes makes it hard to know what to do with empty values
>> shown
>> as "", esp. given that in some formats the empty value abuts the
>> adjacent commas and in others, like MySQL dumps, it abuts only the
>> trailing comma but has a space before the leading comma.  Other times
>> double-quotes are escaped with \", meaning you'll need to escape any
>> in-data backslashes too.
>> 
>> For thinking people, about the time you realize that you're escaping
>> the
>> escape that you've escaped to handle your data, it would occur to you
>> to
>> go back and check your original premise.  But not so for the creators

RE: Comma-delimited values

2010-03-08 Thread Jim Bufalini
I agree with Richard's rant. CSV (Comma Separated Values) is a very, very
old convention that was not well thought out from the beginning. Commas can
exist within cells (values) as in the case of dates or text blocks or even
formatted numbers ($1,000.00). So, to get around this, they added quotes
around cells that could contain commas. The only problem with this, as
Richard points out, is text blocks can also contain quotes. ;-)

This is why I said, whether you are exporting to import into Rev or any
other program, use tab as the value separator and get rid of the arbitrary
quotes that they only put in there because commas can exist in the cell.

Tab is very safe to use as a value separator from a spreadsheet export
because if you press tab when editing a cell, in all spreadsheets I am aware
of, the tab is not inserted into the cell, but instead jumps you to editing
the next cell.

Aloha from Hawaii,

Jim Bufalini

> -Original Message-
> From: use-revolution-boun...@lists.runrev.com [mailto:use-revolution-
> boun...@lists.runrev.com] On Behalf Of Richard Gaskin
> Sent: Monday, March 08, 2010 8:55 AM
> To: How to use Revolution
> Subject: Re: Comma-delimited values
> 
> Paul D. DeRocco wrote:
> > Add an inQuotes flag, with an initial value of false. For each item,
> if it
> > has a quote in it, toggle the inQuotes flag. Then, if inQuotes is
> set,
> > append a comma instead of a tab, to put the item back together again.
> 
> Roger that.  CSV elements may contain returns within quoted portions,
> and escaping uses a wide range of conventions differing from program to
> program (within the Microsoft family I've seen it differ from version
> to
> version of the same program).  The flag method, while notoriously slow,
> is the only reliable method I've found for handling all the many
> variants of CSV.
> 
> 
> A plea for sanity in the software development world:
> 
> While we need to write CSV importers from time to time, please never
> write CSV exporters.
> 
> CSV must die.
> 
> The problem with CSV is that the comma is very commonly used in data,
> making it a uniquely stupid choice as a delimiter.
> 
> That stupidity could be countered with consistent escaping, but no true
> standard has emerged since the many decades of this productivity-abuse
> began.
> 
> Making a stupid decision even more stupid, most CSV files quote
> non-numeric values, as though the programmers did not realize the quote
> character is commonly used in our language and therefore may likely be
> part of the data. So using quote as an escape means that you must
> escape
> the escape.  Jeeminy who thinks up that merde?!?!
> 
> Sometimes the escape for in-data double-quotes is a double double-
> quote,
> which sometimes makes it hard to know what to do with empty values
> shown
> as "", esp. given that in some formats the empty value abuts the
> adjacent commas and in others, like MySQL dumps, it abuts only the
> trailing comma but has a space before the leading comma.  Other times
> double-quotes are escaped with \", meaning you'll need to escape any
> in-data backslashes too.
> 
> For thinking people, about the time you realize that you're escaping
> the
> escape that you've escaped to handle your data, it would occur to you
> to
> go back and check your original premise.  But not so for the creators
> of
> CSV.
> 
> As Jim Bufalini pointed out, tab-delimited or even (in fewer cases)
> pipe-delimited make much saner options.
> 
> For my own delimited exports I've adopted the convention used in
> FileMaker Pro and others, with escapes that are unlikely to be in the
> data:
> 
> - records are delimited with returns
> - fields delimited with tabs
> - quotes are never added and are included only when they are part
>of the data
> - any in-data returns are escaped with ASCII 11
> - any in-data tabs escaped with ASCII 4
> 
> Simple to write, lightning-fast to parse.
> 
> When you add up all the programmer- and end-user-hours lost to dealing
> with the uniquely stupid collection of mish-mashed ad-hoc formats that
> is CSV, it amounts to nothing less than a crime against humanity.
> Several hundred if not thousands of human lifetimes have been lost
> either dealing with bugs related to CSV parsers, or simply waiting for
> the inherently slow parsing of CSV that could have taken mere
> milliseconds if done in any saner format.
> 
> CSV must die.
> 
> Please help it die:  never write CSV exporters.
> 
> 
> --
>   Richard Gaskin
>   Fourth World
>   Rev training and consulting: http://www.fourthworld.com
>   Webzine for Rev developers: http://www.revjournal.com
>   revJournal blog:

Re: Comma-delimited values

2010-03-08 Thread Bob Sneidar
 sending again due to the "email too big" problem. 

> I think the operative word in the original post was "might" as in, "might 
> look like this". I gather that it might also not. Is the question, how do you 
> blindly detect a date format? In that case, you are going to need a function 
> that analyzes a string to determine if any of the data between quotes is a 
> date. 
> 
> I might suggest: (as always watch for line wraps)
> 
> on mouseUp pMouseBtnNo
>    put quote & "Mon, Jan 18 , 2010" & quote & ",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10" into myVar
>    set the itemdelimiter to quote
>    repeat for each item theString in myVar
>       -- or whatever else you want to do at this point
>       if theString is a date then put true
>    end repeat
> end mouseUp
> 
> Two very obvious problems will immediately present themselves:  First, in the 
> example text given, nothing is a date! Why you ask? Because there is a space 
> in the date after the day number, which is enough to convince Rev that the 
> string is not, in fact, a date. But that may be a typo, and the actual data 
> may correct this anomaly. The second thing is, what if the text has no 
> quotes? 
> 
> All this underscores the fact that you need to understand the nature of the 
> information before you can provide a solution. If it is going to be true CSV, 
> then the dates and all non-numeric data will (or should) be enclosed in 
> quotes.  
> 
> One last caution: When using this form of repeat, remember that you cannot 
> alter the contents of myVar while the loop is running because of the way the 
> repeat command handles the variable. 
> 
> So a revised function that does what you want could look like this: 
> 
> on mouseUp pMouseBtnNo
>    breakpoint
>    put quote & "Mon, Jan 18, 2010" & quote & ",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10" into myVar
>    set the itemdelimiter to quote
>    put myVar into tempVar
>    repeat for each item theString in myVar
>       if theString is a date then
>          put the length of theString into theStrLength
>          put offset(theString, tempVar) into charStart
>          -- charEnd is the last character of the date itself
>          put charStart + theStrLength - 1 into charEnd
>          replace comma with "*&*" in char charStart to charEnd of tempVar
>       end if
>    end repeat
>    replace comma with return in tempVar
>    replace "*&*" with comma in tempVar
>    put tempVar into myVar
> end mouseUp
> 
> resulting in data that looks like this:
> 
> "Mon, Jan 18, 2010"
> 9:14 AM
> 130557
> 4319
> Trade
> Buy
> X
> 135
> 8.25
> 10
> -82.5
> 1417.5
> 20
> 10
> 
> Bob
> 
> 
> On Mar 8, 2010, at 9:44 AM, Gregory Lypny wrote:
> 
>> Hello everyone,
>> 
>> I'm creating an app that imports comma-delimited tables.  A few lines might 
>> look like this, where there are 14 items per line.
>> 
>> "Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
>> "Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10
>> 
>> My problem is that Rev treats each date in quotes as three items rather than 
>> one.  I convert the comma delimiters to tab by setting the itemDelimiter to 
>> comma and then running the lines through nested repeat-for-each loops as
>> 
>> repeat for each line thisLine in dataTable
>> repeat for each item thisItem in thisLine
>> put thisItem & tab after newLine
>> end repeat
>> -- more stuff here
>> end repeat
>> 
>> I end up with
>> 
>> "Mon (as the first item)
>> Jan 18   (as the second)
>> 2010"(as the third)
>> 
>> Any suggestions as how I might get the date treated as one item?
>> 
>> Regards,
>> 
>>  Gregory
> 


Re: Comma-delimited values

2010-03-08 Thread Richard Gaskin

Paul D. DeRocco wrote:

Add an inQuotes flag, with an initial value of false. For each item, if it
has a quote in it, toggle the inQuotes flag. Then, if inQuotes is set,
append a comma instead of a tab, to put the item back together again.


Roger that.  CSV elements may contain returns within quoted portions, 
and escaping uses a wide range of conventions differing from program to 
program (within the Microsoft family I've seen it differ from version to 
version of the same program).  The flag method, while notoriously slow, 
is the only reliable method I've found for handling all the many 
variants of CSV.
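
For reference, a minimal sketch of the item-by-item flag method Paul describes, handling one line at a time. The handler name csvLineToTabs is made up for illustration. It refines his description slightly by toggling the flag only when an item holds an odd number of quotes, so that a fully quoted item with no commas is not mistaken for an opening quote. It does not cope with returns inside quoted fields; for those the flag would have to carry over from one line to the next.

```livecode
-- Hypothetical helper, not posted in this thread: rebuilds one CSV line as a
-- tab-delimited line, one comma-separated item at a time.
function csvLineToTabs pLine
   local tInQuotes, tOut, tItem, tStripped, tQuoteCount
   put false into tInQuotes
   put empty into tOut
   set the itemDelimiter to comma
   repeat for each item tItem in pLine
      put tItem after tOut
      -- count the quotes by seeing how much shorter the item is without them
      put tItem into tStripped
      replace quote with empty in tStripped
      put length(tItem) - length(tStripped) into tQuoteCount
      -- an odd count means a quoted field opened or closed in this item
      if tQuoteCount mod 2 is 1 then put not tInQuotes into tInQuotes
      if tInQuotes then
         put comma after tOut   -- still inside quotes: put the comma back
      else
         put tab after tOut     -- between fields: use a tab
      end if
   end repeat
   delete char -1 of tOut       -- drop the trailing delimiter
   return tOut
end csvLineToTabs
```

The quotes themselves are left in place, so the date in Gregory's sample stays as one quoted item followed by a tab.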



A plea for sanity in the software development world:

While we need to write CSV importers from time to time, please never 
write CSV exporters.


CSV must die.

The problem with CSV is that the comma is very commonly used in data, 
making it a uniquely stupid choice as a delimiter.


That stupidity could be countered with consistent escaping, but no true 
standard has emerged since the many decades of this productivity-abuse 
began.


Making a stupid decision even more stupid, most CSV files quote 
non-numeric values, as though the programmers did not realize the quote 
character is commonly used in our language and therefore may likely be 
part of the data. So using quote as an escape means that you must escape 
the escape.  Jeeminy who thinks up that merde?!?!


Sometimes the escape for in-data double-quotes is a double double-quote, 
which sometimes makes it hard to know what to do with empty values shown 
as "", esp. given that in some formats the empty value abuts the 
adjacent commas and in others, like MySQL dumps, it abuts only the 
trailing comma but has a space before the leading comma.  Other times 
double-quotes are escaped with \", meaning you'll need to escape any 
in-data backslashes too.


For thinking people, about the time you realize that you're escaping the 
escape that you've escaped to handle your data, it would occur to you to 
go back and check your original premise.  But not so for the creators of 
CSV.


As Jim Bufalini pointed out, tab-delimited or even (in fewer cases) 
pipe-delimited make much saner options.


For my own delimited exports I've adopted the convention used in 
FileMaker Pro and others, with escapes that are unlikely to be in the 
data:


- records are delimited with returns
- fields delimited with tabs
- quotes are never added and are included only when they are part
  of the data
- any in-data returns are escaped with ASCII 11
- any in-data tabs escaped with ASCII 4
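
A sketch of how the escaping in that list might look in script. The handler names escapeField and unescapeField are made up for illustration; only the ASCII 11 and ASCII 4 substitutions come from the list above.

```livecode
-- Hypothetical helpers: the names are illustrative, the convention is the
-- one listed above (ASCII 11 for in-data returns, ASCII 4 for in-data tabs).
function escapeField pValue
   replace return with numToChar(11) in pValue
   replace tab with numToChar(4) in pValue
   return pValue
end escapeField

function unescapeField pValue
   replace numToChar(11) with return in pValue
   replace numToChar(4) with tab in pValue
   return pValue
end unescapeField
```

An exporter would call escapeField on each value before joining the values with tabs and the records with returns; an importer would set the itemDelimiter to tab, loop over lines and items, and call unescapeField on each item.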

Simple to write, lightning-fast to parse.

When you add up all the programmer- and end-user-hours lost to dealing 
with the uniquely stupid collection of mish-mashed ad-hoc formats that 
is CSV, it amounts to nothing less than a crime against humanity. 
Several hundred if not thousands of human lifetimes have been lost 
either dealing with bugs related to CSV parsers, or simply waiting for 
the inherently slow parsing of CSV that could have taken mere 
milliseconds if done in any saner format.


CSV must die.

Please help it die:  never write CSV exporters.


--
 Richard Gaskin
 Fourth World
 Rev training and consulting: http://www.fourthworld.com
 Webzine for Rev developers: http://www.revjournal.com
 revJournal blog: http://revjournal.com/blog.irv


Re: Comma-delimited values

2010-03-08 Thread DunbarX
Here is an easy way:

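on mouseUp
   put empty into temp
   repeat for each line tLine in yourdata
      -- the quoted date takes up items 1 to 3; starting at item 3 also
      -- converts the comma that follows the closing quote
      put tLine into fixedLine
      replace comma with tab in item 3 to -1 of fixedLine
      put fixedLine & return after temp
   end repeat
   delete char -1 of temp
   put temp into whereEverYouWant
end mouseUp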

This should be pretty readable.

Craig Newman

In a message dated 3/8/10 12:44:39 PM, gregory.ly...@videotron.ca writes:


> "Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
> 


Re: Comma-delimited values

2010-03-08 Thread Fredrik Andersson

Hi,

I'd do it something like this (since I love using set itemdel...):

repeat for each line thisLine in dataTable
   set the itemdel to quote
   put the second item of thisLine & tab into newLine -- should be the date

   -- the rest begins with the comma that followed the closing quote, so skip it
   put char 2 to -1 of the third item of thisLine into theRest
   set the itemdel to comma
   repeat for each item thisItem in theRest
      put thisItem & tab after newLine
   end repeat

   -- more stuff here

end repeat

Cheers,

Fredrik

Gregory Lypny wrote 2010-03-08 18.44:

Hello everyone,

I'm creating an app that imports comma-delimited tables.  A few lines might 
look like this, where there are 14 items per line.

"Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
"Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10

My problem is that Rev treats each date in quotes as three items rather than 
one.  I convert the comma delimiters to tab by setting the itemDelimiter to 
comma and then running the lines through nested repeat-for-each loops as

repeat for each line thisLine in dataTable
repeat for each item thisItem in thisLine
put thisItem & tab after newLine
end repeat
-- more stuff here
end repeat

I end up with

"Mon   (as the first item)
Jan 18  (as the second)
2010"  (as the third)

Any suggestions as how I might get the date treated as one item?

Regards,

Gregory


RE: Comma-delimited values

2010-03-08 Thread Jim Bufalini
Gregory Lypny wrote:

> Hello everyone,
> 
> I'm creating an app that imports comma-delimited tables.  A few lines
> might look like this, where there are 14 items per line.
> 
> "Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
> "Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10
> 
> My problem is that Rev treats each date in quotes as three items rather
> than one.  I convert the comma delimiters to tab by setting the
> itemDelimiter to comma and then running the lines through nested
> repeat-for-each loops as
> 
> repeat for each line thisLine in dataTable
> repeat for each item thisItem in thisLine
> put thisItem & tab after newLine
> end repeat
> -- more stuff here
> end repeat
> 
> I end up with
> 
> "Mon     (as the first item)
> Jan 18   (as the second)
> 2010"    (as the third)
> 
> Any suggestions as how I might get the date treated as one item?

Almost all programs like spreadsheets that export tables in CSV format have
two options that you can set:

1. You can use tab instead of comma as the value delimiter.
2. You can tell it not to use quotes around text and dates.

So, export the tables in tab-delimited format and without quotes. Then, when
importing into Rev, *set itemdel to tab*. If your spreadsheet program
insists on quotes, then *replace quote with empty in tImportedData* after
importing the data into Rev.

Then you can use items and lines to work with the imported data in Rev
without needing any further data manipulation.
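
As a rough sketch of that import path: the function name firstColumn and the
parameter pFilePath are made up for illustration, and it assumes the file
really is tab-delimited, with no tabs inside the fields and no quotes that
need to be kept.

```livecode
function firstColumn pFilePath
   local tImportedData, tColumn
   -- read the whole exported file into a variable
   put URL ("file:" & pFilePath) into tImportedData
   -- strip any quotes the exporting program insisted on adding
   replace quote with empty in tImportedData
   set the itemDelimiter to tab
   repeat for each line tLine in tImportedData
      -- item 1 is now the complete date, internal commas and all
      put item 1 of tLine & return after tColumn
   end repeat
   delete the last char of tColumn -- drop the trailing return
   return tColumn
end firstColumn
```

Because the itemDelimiter is tab, the commas inside the date no longer split
it, so item 1 of each line comes back whole.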

Aloha from Hawaii,

Jim Bufalini



RE: Comma-delimited values

2010-03-08 Thread Paul D. DeRocco
> On 08/03/2010 19:44, Gregory Lypny wrote:
>
> I'm creating an app that imports comma-delimited tables.  A few
> lines might look like this, where there are 14 items per line.
>
> "Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
> "Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10
>
> My problem is that Rev treats each date in quotes as three
> items rather than one.  I convert the comma delimiters to tab by
> setting the itemDelimiter to comma and then running the lines
> through nested repeat-for-each loops as
>
> repeat for each line thisLine in dataTable
> repeat for each item thisItem in thisLine
> put thisItem & tab after newLine
> end repeat
> -- more stuff here
> end repeat
>
> I end up with
>
> "Mon     (as the first item)
> Jan 18   (as the second)
> 2010"    (as the third)
>
> Any suggestions as how I might get the date treated as one item?

Add an inQuotes flag, with an initial value of false. For each item, if it
has a quote in it, toggle the inQuotes flag. Then, if inQuotes is set,
append a comma instead of a tab, to put the item back together again.
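
A minimal sketch of the flag idea, with one change that should be stated
plainly: it toggles the flag once per quote character while walking the line
char by char, rather than once per item, so that a quoted field with no comma
in it (which has both its quotes in a single item) does not leave the flag
stuck on. The function name csvLineToTab is hypothetical. It drops the
quotation marks themselves, so a literal doubled quote inside a field would be
lost too.

```livecode
function csvLineToTab pLine
   local tInQuotes, tResult
   put false into tInQuotes
   repeat for each char tChar in pLine
      if tChar is quote then
         -- entering or leaving a quoted field; the quote itself is not copied
         put not tInQuotes into tInQuotes
      else if tChar is comma and not tInQuotes then
         -- a comma outside quotes is a real field separator
         put tab after tResult
      else
         -- everything else, including commas inside quotes, is kept as is
         put tChar after tResult
      end if
   end repeat
   return tResult
end csvLineToTab
```

Called once per line of the table, this should turn the quoted date into a
single tab-delimited item with its internal commas intact.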

--

Ciao,   Paul D. DeRocco
Paul    mailto:pdero...@ix.netcom.com



Re: Comma-delimited values

2010-03-08 Thread Mark Schonewille

Hi Gregory,

Does this work for you?

function csv2tab theData
 repeat for each line myLine in theData
  repeat for each word myWord in myLine
   if char 1 of myWord is quote and (char -2 of myWord is quote or char -1 of myWord is quote) then
    replace comma with "<\#>" in myWord
   end if
   -- caution: words are appended without the spaces that separated them
   put myWord after myNewData
  end repeat
  put cr after myNewData
 end repeat
 replace comma with tab in myNewData
 replace "<\#>" with comma in myNewData
 return myNewData
end csv2tab

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer

Economy-x-Talk is always looking for new software development  
projects. Feel free to contact me for a quote.


On 8 Mar 2010, at 18:44, Gregory Lypny wrote:


Hello everyone,

I'm creating an app that imports comma-delimited tables.  A few  
lines might look like this, where there are 14 items per line.


"Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
"Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10


My problem is that Rev treats each date in quotes as three items  
rather than one.  I convert the comma delimiters to tab by setting  
the itemDelimiter to comma and then running the lines through nested  
repeat-for-each loops as


repeat for each line thisLine in dataTable
repeat for each item thisItem in thisLine
put thisItem & tab after newLine
end repeat
-- more stuff here
end repeat

I end up with

"Mon   (as the first item)
Jan 18  (as the second)
2010"  (as the third)

Any suggestions as how I might get the date treated as one item?

Regards,

Gregory




Re: Comma-delimited values

2010-03-08 Thread Jeff Massung
Couple ideas for you:

Loop over each word in the line and if the word doesn't have quotes, replace
comma with tab:

repeat with tWord = the number of words in tLine down to 1
   -- work backwards: each new tab splits a word, which would shift the numbering of later words
   if the first char of word tWord of tLine is not quote then
      replace comma with tab in word tWord of tLine
   end if
end repeat

Use a regex or another search function to find quotes and skip over them,
replacing commas with tabs for everything in between.
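
One way to sketch the "skip over the quotes" idea without a regex is to split
the line on the quote character instead: with the itemDelimiter set to quote,
the odd-numbered chunks lie outside quotes and the even-numbered ones inside.
This swaps in itemDelimiter splitting for the regex, and the function name
tabOutsideQuotes is made up for illustration. It assumes quotes are balanced
and drops the quotation marks from the result.

```livecode
function tabOutsideQuotes pLine
   local tResult, tIndex, tPiece
   set the itemDelimiter to quote
   put 0 into tIndex
   repeat for each item tChunk in pLine
      add 1 to tIndex
      put tChunk into tPiece
      if tIndex mod 2 is 1 then
         -- odd-numbered chunks sit outside quotes, so their commas are separators
         replace comma with tab in tPiece
      end if
      -- even-numbered chunks were inside quotes and are copied unchanged
      put tPiece after tResult
   end repeat
   return tResult
end tabOutsideQuotes
```

The itemDelimiter change is local to the function, so the caller's delimiter
is left alone.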

If you have access to the input data (read: from a db query or some-such),
just modify it at the source.

If your quoted data is always dates you could change the dates to something
else. For example:

repeat with tIdx = 1 to the number of words in tLine
   if word tIdx of tLine is a date then
  convert word tIdx of tLine to seconds
   end if
end repeat

And now you can do other kinds of operations on the date or even convert
them back to dates later on.

I wouldn't try doing anything goofy like trying to change commas within
quotes to something else and then change them back. It'll lead to bugs later
on I'm quite sure.

This is the one thing about Rev I wish I could change... quotes should wrap
items just as they wrap words.

Jeff M.

On Mon, Mar 8, 2010 at 11:44 AM, Gregory Lypny
wrote:

> Hello everyone,
>
> I'm creating an app that imports comma-delimited tables.  A few lines might
> look like this, where there are 14 items per line.
>
> "Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
> "Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10
>
> My problem is that Rev treats each date in quotes as three items rather
> than one.  I convert the comma delimiters to tab by setting the
> itemDelimiter to comma and then running the lines through nested
> repeat-for-each loops as
>
> repeat for each line thisLine in dataTable
> repeat for each item thisItem in thisLine
> put thisItem & tab after newLine
> end repeat
> -- more stuff here
> end repeat
>
> I end up with
>
> "Mon     (as the first item)
> Jan 18   (as the second)
> 2010"    (as the third)
>
> Any suggestions as how I might get the date treated as one item?
>
> Regards,
>
>Gregory


Re: Comma-delimited values

2010-03-08 Thread Richmond Mathewson

On 08/03/2010 19:44, Gregory Lypny wrote:

Hello everyone,

I'm creating an app that imports comma-delimited tables.  A few lines might 
look like this, where there are 14 items per line.

"Mon, Jan 18 , 2010",9:14 AM,130557,4319,Trade,Buy,X,135,8.25,10,-82.5,1417.5,20,10
"Mon, Jan 18 , 2010",9:14 AM,130558,4371,Accept,Your ASK,X,135,8.25,10,82.5,1582.5,0,10

My problem is that Rev treats each date in quotes as three items rather than 
one.  I convert the comma delimiters to tab by setting the itemDelimiter to 
comma and then running the lines through nested repeat-for-each loops as

repeat for each line thisLine in dataTable
repeat for each item thisItem in thisLine
put thisItem & tab after newLine
end repeat
-- more stuff here
end repeat

I end up with

"Mon   (as the first item)
Jan 18  (as the second)
2010"  (as the third)

Any suggestions as how I might get the date treated as one item?


   
Yes, although it is so goofily obvious you have probably thought about
this one and rejected it:

Change the commas for the bits inside the quotes to something else ( ^ * %) -
dunno, any old thing that isn't a comma . . . :)
