subject:"CSV again."

Well, there goes that idea.  There are tutorials right on Git, but it might
be easier if you (and anyone else so not-inclined to Git) post here and
those of us who are at least inclined to try will make do with doing that
work for you.

Anyway, here's what I have as the latest version, with a couple of things I
added to it, marked as "mikey"

function CSVToTab pData,pcoldelim
  local tNuData -- contains tabbed copy of data
  local tReturnPlaceholder -- replaces cr in field data to avoid line
  --   breaks which would be misread as records;
  local tNuDelim  -- new character to replace the delimiter
  local tStatus, theInsideStringSoFar
  --
  put numtochar(11) into tReturnPlaceholder -- vertical tab as
placeholder
  put numtochar(29) into tNuDelim
  --
  if pcoldelim is empty then put comma into pcoldelim
  -- Normalize line endings:
  replace crlf with cr in pData  -- Win to UNIX
  replace numtochar(13) with cr in pData -- Mac to UNIX

  put CR after pData #last line may not properly parse, otherwise #mikey

  put "outside" into tStatus
  set the itemdel to quote
  repeat for each item k in pData
-- put tStatus && k & CR after msg
switch tStatus

  case "inside"
put k after theInsideStringSoFar
put "passedquote" into tStatus
next repeat

  case "passedquote"
-- decide if it was a duplicated escapedQuote or a
closing quote
if k is empty then   -- it's a duplicated quote
  put quote after theInsideStringSoFar
  put "inside" into tStatus
  next repeat
end if
-- not empty - so we remain inside the cell, though
we have left the quoted section
-- NB this allows for quoted sub-strings within the
cell content !!
replace cr with tReturnPlaceholder in
theInsideStringSoFar
put theInsideStringSoFar after tNuData

  case "outside"
replace pcoldelim with tNuDelim in k
-- and deal with the "empty trailing item" issue in
Livecode
replace (tNuDelim & CR) with tNuDelim & tNuDelim &
CR in k
put k after tNuData
put "inside" into tStatus
put empty into theInsideStringSoFar
next repeat
  default
put "defaulted"
break
end switch
  end repeat
  replace tNuDelim with tab in tNuData #mikey
  delete last char of tNuData #added at top to assist last line parse
#mikey
  return tNuData
end CSVToTab




On Sun, Oct 18, 2015 at 8:06 PM, Alex Tweedly  wrote:

>
>
> On 18/10/2015 13:57, Mike Kerner wrote:
>
>> https://github.com/macMikey/LiveCode-Libraries/tree/master/csv
>>
>> I've found some corner cases and made some others.
>>
>>
>> OK, I confess:
>
> I've never used git or github, and I have no idea how to get access to
> these.  :-)
>
> I know I need to learn, but honestly this is not the right time for me to
> do that - is there a 5-minute tutorial (or step-by-step instruction) that I
> can follow to at least get these files ?
>
> Many thanks
> Alex.
>
>
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-18 Thread Alex Tweedly




On 18/10/2015 13:57, Mike Kerner wrote:

https://github.com/macMikey/LiveCode-Libraries/tree/master/csv

I've found some corner cases and made some others.



OK, I confess:

I've never used git or github, and I have no idea how to get access to 
these.  :-)


I know I need to learn, but honestly this is not the right time for me 
to do that - is there a 5-minute tutorial (or step-by-step instruction) 
that I can follow to at least get these files ?


Many thanks
Alex.

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-18 Thread Alex Tweedly




On 18/10/2015 03:17, Peter M. Brigham wrote:

At this point, finding a function that does the task at all -- reliably and 
taking into account most of the csv malformations we can anticipate -- would be 
a start. So far nothing has been unbreakable. Once we find an algorithm that 
does the job, we can focus on speeding it up.


That is indeed the issue.

There are two distinct problems, and the "best" solutions for each may 
be different.


1. Optimistic parser.

Properly parse any well-formed CSV data, in any idiosyncratic dialect of 
CSV that we may be interested in.


Or to put it otherwise, in general we are going to be parsing data 
produced by some program - it may take some oddball approach to CSV 
formatting, but it will be "correct" in the program's own terms. We are 
not (in this problem) trying to handle, e.g., hand-generated files that 
may contain errors, or have deliberate errors embedded. Thus, we do not 
expect things like mis-matched quotes, etc. - and it will be adequate to 
do "something reasonable" given bad input data.


2. Pessimistic parser.

Just the opposite - try to detect any arbitrary malformation with a 
sensible error message, and properly parse any well-formed CSV data in 
any dialect we might encounter.


And common to both
- adequate (optional) control over delimiters, escaped characters in the 
output, etc.

- efficiency (speed) matters

IMHO, we should also specify that the output should
 - remove the enclosing quotes from quoted cells
 - reduce doubled-quotes within a quoted cell to the appropriate single 
instance of a quote
in order that the TSV (or array, or whatever output format is chosen) 
does not need further processing to remove them; i.e. the output data is 
clean of any CSV formatting artifacts.


Personally, I am a pragmatist, and I have always needed solution 1 above 
- whenever I've had to parse CSV data, it's because I had a real-world 
need to do so, and the data was coming from some well-behaved (even if 
very weird) application - so it was consistent and followed some kind of 
rules, however wacky those rules might be. Other people may have 
different needs.


So I believe that any proposed algorithm should be clear about which of 
these two distinct problems it is trying to solve, and should be judged 
accordingly. Then each of us can look for the most efficient solution to 
whichever one they most care about.


I do believe that any solution to problem 2 is also a solution to 
problem 1 - but I don't know if it can be as efficient while tackling 
that harder problem.


-- Alex.



___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

Consider them added.  They're called "Richard-1.csv" and "Richard-2.csv"

On Sun, Oct 18, 2015 at 6:46 PM, Richard Gaskin 
wrote:

> Mike Kerner wrote:
>
>> I don't have a corner case file, yet, but I'm going to start adding one to
>> Git in a minute...
>>
>> On Sun, Oct 18, 2015 at 2:26 AM, Kay C Lan 
>> wrote:
>>
>> On Sun, Oct 18, 2015 at 10:17 AM, Peter M. Brigham 
>>> wrote:
>>>
>>> > At this point, finding a function that does the task at all -- reliably
>>> > and taking into account most of the csv malformations we can anticipate
>>> --
>>> > would be a start.
>>>
>>
> The snippet included in my article is commonly used to test CSV parsers,
> which is unfortunate since it only covers a relatively small handful of
> edge cases - I added a case for in-data returns just below it:
>
> 
>
> Even then woefully incomplete, but hopefully worthwhile as at least a
> starting point.
>
> --
>  Richard Gaskin
>  Fourth World Systems
>  Software Design and Development for the Desktop, Mobile, and the Web
>  
>  ambassa...@fourthworld.comhttp://www.FourthWorld.com
>
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-18 Thread Richard Gaskin


Mike Kerner wrote:

I don't have a corner case file, yet, but I'm going to start adding one to
Git in a minute...

On Sun, Oct 18, 2015 at 2:26 AM, Kay C Lan  wrote:


On Sun, Oct 18, 2015 at 10:17 AM, Peter M. Brigham 
wrote:

> At this point, finding a function that does the task at all -- reliably
> and taking into account most of the csv malformations we can anticipate
--
> would be a start.


The snippet included in my article is commonly used to test CSV parsers, 
which is unfortunate since it only covers a relatively small handful of 
edge cases - I added a case for in-data returns just below it:




Even then woefully incomplete, but hopefully worthwhile as at least a 
starting point.


--
 Richard Gaskin
 Fourth World Systems
 Software Design and Development for the Desktop, Mobile, and the Web
 
 ambassa...@fourthworld.comhttp://www.FourthWorld.com

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

https://github.com/macMikey/LiveCode-Libraries/tree/master/csv

I've found some corner cases and made some others.

On Sun, Oct 18, 2015 at 8:01 AM, Mike Kerner 
wrote:

> I don't have a corner case file, yet, but I'm going to start adding one to
> Git in a minute...
>
> On Sun, Oct 18, 2015 at 2:26 AM, Kay C Lan 
> wrote:
>
>> On Sun, Oct 18, 2015 at 10:17 AM, Peter M. Brigham 
>> wrote:
>>
>> > At this point, finding a function that does the task at all -- reliably
>> > and taking into account most of the csv malformations we can anticipate
>> --
>> > would be a start.
>>
>>
>> Actually, having a standard mutant csv file to work on would be a start.
>> Probably two files, a plain text file that needs to fed fed into the algo
>> and a pdf version which shows exactly how the data is suppose to appear.
>>
>> What are people using for their example file? We need to check it contains
>> all possible mutations.
>> ___
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>
>
>
>
> --
> On the first day, God created the heavens and the Earth
> On the second day, God created the oceans.
> On the third day, God put the animals on hold for a few hours,
>and did a little diving.
> And God said, "This is good."
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

I don't have a corner case file, yet, but I'm going to start adding one to
Git in a minute...

On Sun, Oct 18, 2015 at 2:26 AM, Kay C Lan  wrote:

> On Sun, Oct 18, 2015 at 10:17 AM, Peter M. Brigham 
> wrote:
>
> > At this point, finding a function that does the task at all -- reliably
> > and taking into account most of the csv malformations we can anticipate
> --
> > would be a start.
>
>
> Actually, having a standard mutant csv file to work on would be a start.
> Probably two files, a plain text file that needs to fed fed into the algo
> and a pdf version which shows exactly how the data is suppose to appear.
>
> What are people using for their example file? We need to check it contains
> all possible mutations.
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-17 Thread Kay C Lan

On Sun, Oct 18, 2015 at 10:17 AM, Peter M. Brigham  wrote:

> At this point, finding a function that does the task at all -- reliably
> and taking into account most of the csv malformations we can anticipate --
> would be a start.

Actually, having a standard mutant csv file to work on would be a start.
Probably two files, a plain text file that needs to fed fed into the algo
and a pdf version which shows exactly how the data is suppose to appear.

What are people using for their example file? We need to check it contains
all possible mutations.
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

Peter,
You're absolutely right, of course.  While we're at it, it would be
interesting to see what we come up with if we write it for LCB's modules...

On Sat, Oct 17, 2015 at 10:17 PM, Peter M. Brigham  wrote:

> At this point, finding a function that does the task at all -- reliably
> and taking into account most of the csv malformations we can anticipate --
> would be a start. So far nothing has been unbreakable. Once we find an
> algorithm that does the job, we can focus on speeding it up.
>
> That said, I don't know that my solution is optimized for speed very well.
> It takes 4-5 seconds to process a 986 record file. On an old slow machine,
> a 2008 MacBook 2.1 GHz Intel Core Duo, but still….
>
> -- Peter
>
> Peter M. Brigham
> pmb...@gmail.com
> http://home.comcast.net/~pmbrig
>
> On Oct 17, 2015, at 10:05 PM, Mike Kerner wrote:
>
> > The other thing that we are going to be interested in is finding the
> > fastest function that performs the task.
> >
> > On Sat, Oct 17, 2015 at 10:04 PM, Mike Kerner  >
> > wrote:
> >
> >> I think that item is odd.  Quotes are, if memory serves, only supposed
> to
> >> appear if they are double-quoted.  Between "f" and "g" you have three
> >> quotes, and between "g" and "h" you only have one.  I believe that is
> not a
> >> correct csv format.
> >>
> >> On Sat, Oct 17, 2015 at 9:24 PM, Peter M. Brigham 
> >> wrote:
> >>
> >>> On Oct 17, 2015, at 8:47 PM, Alex Tweedly wrote:
> >>>
>  Also, I think (i.e. I haven't yet run the code, since I don't have
> >>> offsets() available) there is another mis-formed case you don't
> properly
> >>> detect :
>  a,b,c,"def"""g"h",i,j,k
> >>>
> >>> if I put this as one of the lines of my CSV data, it gets sorted into
> the
> >>> array properly. I think. That is, the 4th item of the line is
> >>>
> >>> "def"""g"h"
> >>>
> >>> Do you get the same result?
> >>>
> >>> -- Peter
> >>>
> >>> Peter M. Brigham
> >>> pmb...@gmail.com
> >>> http://home.comcast.net/~pmbrig
> >>>
> >>>
> >>>
> >>> ___
> >>> use-livecode mailing list
> >>> use-livecode@lists.runrev.com
> >>> Please visit this url to subscribe, unsubscribe and manage your
> >>> subscription preferences:
> >>> http://lists.runrev.com/mailman/listinfo/use-livecode
> >>>
> >>
> >>
> >>
> >> --
> >> On the first day, God created the heavens and the Earth
> >> On the second day, God created the oceans.
> >> On the third day, God put the animals on hold for a few hours,
> >>   and did a little diving.
> >> And God said, "This is good."
> >>
> >
> >
> >
> > --
> > On the first day, God created the heavens and the Earth
> > On the second day, God created the oceans.
> > On the third day, God put the animals on hold for a few hours,
> >   and did a little diving.
> > And God said, "This is good."
> > ___
> > use-livecode mailing list
> > use-livecode@lists.runrev.com
> > Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> > http://lists.runrev.com/mailman/listinfo/use-livecode
>
>
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

At this point, finding a function that does the task at all -- reliably and 
taking into account most of the csv malformations we can anticipate -- would be 
a start. So far nothing has been unbreakable. Once we find an algorithm that 
does the job, we can focus on speeding it up.

That said, I don't know that my solution is optimized for speed very well. It 
takes 4-5 seconds to process a 986 record file. On an old slow machine, a 2008 
MacBook 2.1 GHz Intel Core Duo, but still….

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

On Oct 17, 2015, at 10:05 PM, Mike Kerner wrote:

> The other thing that we are going to be interested in is finding the
> fastest function that performs the task.
> 
> On Sat, Oct 17, 2015 at 10:04 PM, Mike Kerner 
> wrote:
> 
>> I think that item is odd.  Quotes are, if memory serves, only supposed to
>> appear if they are double-quoted.  Between "f" and "g" you have three
>> quotes, and between "g" and "h" you only have one.  I believe that is not a
>> correct csv format.
>> 
>> On Sat, Oct 17, 2015 at 9:24 PM, Peter M. Brigham 
>> wrote:
>> 
>>> On Oct 17, 2015, at 8:47 PM, Alex Tweedly wrote:
>>> 
 Also, I think (i.e. I haven't yet run the code, since I don't have
>>> offsets() available) there is another mis-formed case you don't properly
>>> detect :
 a,b,c,"def"""g"h",i,j,k
>>> 
>>> if I put this as one of the lines of my CSV data, it gets sorted into the
>>> array properly. I think. That is, the 4th item of the line is
>>> 
>>> "def"""g"h"
>>> 
>>> Do you get the same result?
>>> 
>>> -- Peter
>>> 
>>> Peter M. Brigham
>>> pmb...@gmail.com
>>> http://home.comcast.net/~pmbrig
>>> 
>>> 
>>> 
>>> ___
>>> use-livecode mailing list
>>> use-livecode@lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>> 
>> 
>> 
>> 
>> --
>> On the first day, God created the heavens and the Earth
>> On the second day, God created the oceans.
>> On the third day, God put the animals on hold for a few hours,
>>   and did a little diving.
>> And God said, "This is good."
>> 
> 
> 
> 
> -- 
> On the first day, God created the heavens and the Earth
> On the second day, God created the oceans.
> On the third day, God put the animals on hold for a few hours,
>   and did a little diving.
> And God said, "This is good."
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription 
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode


___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

The other thing that we are going to be interested in is finding the
fastest function that performs the task.

On Sat, Oct 17, 2015 at 10:04 PM, Mike Kerner 
wrote:

> I think that item is odd.  Quotes are, if memory serves, only supposed to
> appear if they are double-quoted.  Between "f" and "g" you have three
> quotes, and between "g" and "h" you only have one.  I believe that is not a
> correct csv format.
>
> On Sat, Oct 17, 2015 at 9:24 PM, Peter M. Brigham 
> wrote:
>
>> On Oct 17, 2015, at 8:47 PM, Alex Tweedly wrote:
>>
>> > Also, I think (i.e. I haven't yet run the code, since I don't have
>> offsets() available) there is another mis-formed case you don't properly
>> detect :
>> > a,b,c,"def"""g"h",i,j,k
>>
>> if I put this as one of the lines of my CSV data, it gets sorted into the
>> array properly. I think. That is, the 4th item of the line is
>>
>> "def"""g"h"
>>
>>  Do you get the same result?
>>
>> -- Peter
>>
>> Peter M. Brigham
>> pmb...@gmail.com
>> http://home.comcast.net/~pmbrig
>>
>>
>>
>> ___
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>
>
>
>
> --
> On the first day, God created the heavens and the Earth
> On the second day, God created the oceans.
> On the third day, God put the animals on hold for a few hours,
>and did a little diving.
> And God said, "This is good."
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

I think that item is odd.  Quotes are, if memory serves, only supposed to
appear if they are double-quoted.  Between "f" and "g" you have three
quotes, and between "g" and "h" you only have one.  I believe that is not a
correct csv format.

On Sat, Oct 17, 2015 at 9:24 PM, Peter M. Brigham  wrote:

> On Oct 17, 2015, at 8:47 PM, Alex Tweedly wrote:
>
> > Also, I think (i.e. I haven't yet run the code, since I don't have
> offsets() available) there is another mis-formed case you don't properly
> detect :
> > a,b,c,"def"""g"h",i,j,k
>
> if I put this as one of the lines of my CSV data, it gets sorted into the
> array properly. I think. That is, the 4th item of the line is
>
> "def"""g"h"
>
>  Do you get the same result?
>
> -- Peter
>
> Peter M. Brigham
> pmb...@gmail.com
> http://home.comcast.net/~pmbrig
>
>
>
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

On Oct 17, 2015, at 8:47 PM, Alex Tweedly wrote:

> Also, I think (i.e. I haven't yet run the code, since I don't have offsets() 
> available) there is another mis-formed case you don't properly detect :
> a,b,c,"def"""g"h",i,j,k

if I put this as one of the lines of my CSV data, it gets sorted into the array 
properly. I think. That is, the 4th item of the line is 

"def"""g"h"

 Do you get the same result?

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

Thanks for catching that. Change the if-then structure to:

if howmany(openQuoteChar,thisItem) <> howmany(closeQuoteChar,thisItem) then
return "This CSV data is not parsable (unclosed quotes in item)."
end if

Revised function:

function CSVtoArray pData
   -- by Peter M. Brigham, pmb...@gmail.com
   -- requires getDelimiters(), howmany(), offsets()
   put getDelimiters(pData,5) into tDelims
   put line 1 of tDelims into crChar
   put line 2 of tDelims into tabChar
   put line 3 of tDelims into commaChar
   put line 4 of tDelims into openQuoteChar
   put line 5 of tDelims into closeQuoteChar
   
   replace crlf with cr in pData  -- Win to UNIX
   replace numtochar(13) with cr in pData -- Mac to UNIX
   
   if howmany(quote,pData) mod 2 = 1 then
  return "This CSV data is not parsable (unclosed quotes in data)."
   end if
   
   put offsets(quote,pData) into qOffsets
   if qOffsets > 0 then
  put 1 into counter
  repeat for each item q in qOffsets
 if counter mod 2 = 1 then put openQuoteChar into char q of pData
 else put closeQuoteChar into char q of pData
 add 1 to counter
  end repeat
   end if
   
   put offsets(cr,pData) into crOffsets
   repeat for each item r in crOffsets
  put char 1 to r of pData into upToHere
  if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
 -- the cr is within a quoted string
 put crChar into char r of pData
  end if
   end repeat
   put offsets(tab,pData) into tabOffsets
   repeat for each item t in tabOffsets
  put char 1 to t of pData into upToHere
  if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
 -- the tab is within a quoted string
 put tabChar into char t of pData
  end if
   end repeat
   put offsets(comma,pData) into commaOffsets
   repeat for each item c in commaOffsets
  put char 1 to c of pData into upToHere
  if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
 -- the comma is within a quoted string
 put commaChar into char c of pData
  end if
   end repeat
   put 0 into lineCounter
   repeat for each line L in pData
  add 1 to lineCounter
  put 0 into itemCounter
  repeat for each item i in L
 add 1 to itemCounter
 put i into thisItem
 if howmany(openQuoteChar,thisItem) <> howmany(closeQuoteChar,thisItem) 
then
return "This CSV data is not parsable (unclosed quotes in item)."
 end if
 replace crChar with cr in thisItem
 replace tabChar with tab in thisItem
 replace commaChar with comma in thisItem
 replace openQuoteChar with quote in thisItem
 replace closeQuoteChar with quote in thisItem
 put thisItem into A[lineCounter][itemCounter]
  end repeat
   end repeat
   return A
end CSVtoArray

--

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig


On Oct 17, 2015, at 8:47 PM, Alex Tweedly wrote:

> Ummm  surely at this point
> 
> 
> 
>  repeat for each item i in L
> add 1 to itemCounter
> put i into thisItem
> if howmany(quote,thisItem) mod 2 = 1 then
>return "This CSV data is not parsable (unclosed quotes in item)."
> end if
> 
> ...
> 
> howmany(quote,thisItem) must be 0 - all quotes have been replaced by either 
> openQuoteChar or closeQuoteChar
> 
> Shouldn't this test be
>   if howmany(openQuoteChar, thisItem) <> howmany(closeUqoteChar, thisItem)  
> then
> 
> 
> Also, I think (i.e. I haven't yet run the code, since I don't have offsets() 
> available) there is another mis-formed case you don't properly detect :
> a,b,c,"def"""g"h",i,j,k
> 
> The quoted cell contains the right number (i.e. a multiple of 2) of quotes, 
> but they are not suitably adjacent, so they can't be properly interpreted as 
> paired 'enclosed quotes'.   (I should say, none of the earlier versions 
> detect this either - their intent was to make the best feasible result from 
> well-formed data, and not to detect all malformed cases - but if this version 
> is going to detect and give error returns for error inputs in some cases, 
> then we should try to do it fully).
> 
> -- Alex.
> 
> 
> On 18/10/2015 00:41, Peter M. Brigham wrote:
>> So here's my attempt. It converts a CVS text to an array. Let's see if 
>> there's csv data that can break it.
>> 
>> -- Peter
>> 
>> Peter M. Brigham
>> pmb...@gmail.com
>> http://home.comcast.net/~pmbrig
>> 
>> ---
>> 
>> function CSVtoArray pData
>>-- by Peter M. Brigham, pmb...@gmail.com
>>-- requires getDelimiters(), howmany()
>>put getDelimiters(pData,5) into tDelims
>>put line 1 of tDelims into crChar
>>put line 2 of tDelims into tabChar
>>put line 3 of tDelims into commaChar
>>put line 4 of tDelims into openQuoteChar
>>put line 5 of tDelims into closeQuoteChar
>>replace crlf with cr in pData

Re: CSV again.

2015-10-17 Thread Alex Tweedly


Ummm  surely at this point



  repeat for each item i in L
 add 1 to itemCounter
 put i into thisItem
 if howmany(quote,thisItem) mod 2 = 1 then
return "This CSV data is not parsable (unclosed quotes in item)."
 end if

...

howmany(quote,thisItem) must be 0 - all quotes have been replaced by 
either openQuoteChar or closeQuoteChar


Shouldn't this test be
   if howmany(openQuoteChar, thisItem) <> howmany(closeUqoteChar, 
thisItem)  then



Also, I think (i.e. I haven't yet run the code, since I don't have 
offsets() available) there is another mis-formed case you don't properly 
detect :

a,b,c,"def"""g"h",i,j,k

The quoted cell contains the right number (i.e. a multiple of 2) of 
quotes, but they are not suitably adjacent, so they can't be properly 
interpreted as paired 'enclosed quotes'.   (I should say, none of the 
earlier versions detect this either - their intent was to make the best 
feasible result from well-formed data, and not to detect all malformed 
cases - but if this version is going to detect and give error returns 
for error inputs in some cases, then we should try to do it fully).


-- Alex.


On 18/10/2015 00:41, Peter M. Brigham wrote:

So here's my attempt. It converts a CVS text to an array. Let's see if there's 
csv data that can break it.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

---

function CSVtoArray pData
-- by Peter M. Brigham, pmb...@gmail.com
-- requires getDelimiters(), howmany()
put getDelimiters(pData,5) into tDelims
put line 1 of tDelims into crChar
put line 2 of tDelims into tabChar
put line 3 of tDelims into commaChar
put line 4 of tDelims into openQuoteChar
put line 5 of tDelims into closeQuoteChar

replace crlf with cr in pData  -- Win to UNIX

replace numtochar(13) with cr in pData -- Mac to UNIX

if howmany(quote,pData) mod 2 = 1 then

   return "This CSV data is not parsable (unclosed quotes in data)."
end if

put offsets(quote,pData) into qOffsets

if qOffsets > 0 then
   put 1 into counter
   repeat for each item q in qOffsets
  if counter mod 2 = 1 then put openQuoteChar into char q of pData
  else put closeQuoteChar into char q of pData
  add 1 to counter
   end repeat
end if

put offsets(cr,pData) into crOffsets

repeat for each item r in crOffsets
   put char 1 to r of pData into upToHere
   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
  -- the cr is within a quoted string
  put crChar into char r of pData
   end if
end repeat
put offsets(tab,pData) into tabOffsets
repeat for each item t in tabOffsets
   put char 1 to t of pData into upToHere
   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
  -- the tab is within a quoted string
  put tabChar into char t of pData
   end if
end repeat
put offsets(comma,pData) into commaOffsets
repeat for each item c in commaOffsets
   put char 1 to c of pData into upToHere
   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
  -- the comma is within a quoted string
  put commaChar into char c of pData
   end if
end repeat
put 0 into lineCounter
repeat for each line L in pData
   add 1 to lineCounter
   put 0 into itemCounter
   repeat for each item i in L
  add 1 to itemCounter
  put i into thisItem
  if howmany(quote,thisItem) mod 2 = 1 then
 return "This CSV data is not parsable (unclosed quotes in item)."
  end if
  replace crChar with cr in thisItem
  replace tabChar with tab in thisItem
  replace commaChar with comma in thisItem
  replace openQuoteChar with quote in thisItem
  replace closeQuoteChar with quote in thisItem
  put thisItem into A[lineCounter][itemCounter]
   end repeat
end repeat
return A
end CSVtoArray

function getDelimiters pText, nbr
-- returns a cr-delimited list of  characters
--not found in the variable pText
-- use for delimiters for, eg, parsing text files, manipulating arrays, etc.
-- usage: put getDelimiters(pText,2) into tDelims
--if tDelims begins with "Error" then exit to top -- or whatever
--put line 1 of tDelims into lineDivider
--put line 2 of tDelims into itemDivider
-- etc.
-- by Peter M. Brigham, pmb...@gmail.com — freeware

if pText = empty then return "Error: no text specified."

if nbr = empty then put 1 into nbr -- default 1 delimiter
put "2,3,4,5,6,7,8,16,17,18,19,20,21,22,23,24,25,26" into baseList
-- low ASCII values, excluding CR, LF, tab, etc.
put the number of items of baseList into maxNbr
if nbr > maxNbr then return "Error: max"

Re: CSV again.

My mistake, failed to include the offsets() handler:

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

---

function offsets str, pContainer
   -- returns a comma-delimited list of all the offsets of str in pContainer
   -- returns 0 if not found
   -- note: offsets("xx","xx") returns "1,3,5" not "1,2,3,4,5"
   -- ie, overlapping offsets are not counted
   -- note: to get the last occurrence of a string in a container (often useful)
   -- use "item -1 of offsets(...)"
   -- by Peter M. Brigham, pmb...@gmail.com — freeware
   
   if str is not in pContainer then return 0
   put 0 into startPoint
   repeat
  put offset(str,pContainer,startPoint) into thisOffset
  if thisOffset = 0 then exit repeat
  add thisOffset to startPoint
  put startPoint & comma after offsetList
  add length(str)-1 to startPoint
   end repeat
   return item 1 to -1 of offsetList -- delete trailing comma
end offsets


On Oct 17, 2015, at 8:30 PM, Alex Tweedly wrote:

> Hi Peter,
> 
> it also requires offsets() - I can guess what it does, but it would be safer 
> to get the actual code you use :-)
> 
> Thanks
> -- Alex.
> 
> On 18/10/2015 00:41, Peter M. Brigham wrote:
>> So here's my attempt. It converts a CVS text to an array. Let's see if 
>> there's csv data that can break it.
>> 
>> -- Peter
>> 
>> Peter M. Brigham
>> pmb...@gmail.com
>> http://home.comcast.net/~pmbrig
>> 
>> ---
>> 
>> function CSVtoArray pData
>>-- by Peter M. Brigham, pmb...@gmail.com
>>-- requires getDelimiters(), howmany()
>>put getDelimiters(pData,5) into tDelims
>>put line 1 of tDelims into crChar
>>put line 2 of tDelims into tabChar
>>put line 3 of tDelims into commaChar
>>put line 4 of tDelims into openQuoteChar
>>put line 5 of tDelims into closeQuoteChar
>>replace crlf with cr in pData  -- Win to UNIX
>>replace numtochar(13) with cr in pData -- Mac to UNIX
>>if howmany(quote,pData) mod 2 = 1 then
>>   return "This CSV data is not parsable (unclosed quotes in data)."
>>end if
>>put offsets(quote,pData) into qOffsets
>>if qOffsets > 0 then
>>   put 1 into counter
>>   repeat for each item q in qOffsets
>>  if counter mod 2 = 1 then put openQuoteChar into char q of pData
>>  else put closeQuoteChar into char q of pData
>>  add 1 to counter
>>   end repeat
>>end if
>>put offsets(cr,pData) into crOffsets
>>repeat for each item r in crOffsets
>>   put char 1 to r of pData into upToHere
>>   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
>> then
>>  -- the cr is within a quoted string
>>  put crChar into char r of pData
>>   end if
>>end repeat
>>put offsets(tab,pData) into tabOffsets
>>repeat for each item t in tabOffsets
>>   put char 1 to t of pData into upToHere
>>   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
>> then
>>  -- the tab is within a quoted string
>>  put tabChar into char t of pData
>>   end if
>>end repeat
>>put offsets(comma,pData) into commaOffsets
>>repeat for each item c in commaOffsets
>>   put char 1 to c of pData into upToHere
>>   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
>> then
>>  -- the comma is within a quoted string
>>  put commaChar into char c of pData
>>   end if
>>end repeat
>>put 0 into lineCounter
>>repeat for each line L in pData
>>   add 1 to lineCounter
>>   put 0 into itemCounter
>>   repeat for each item i in L
>>  add 1 to itemCounter
>>  put i into thisItem
>>  if howmany(quote,thisItem) mod 2 = 1 then
>> return "This CSV data is not parsable (unclosed quotes in item)."
>>  end if
>>  replace crChar with cr in thisItem
>>  replace tabChar with tab in thisItem
>>  replace commaChar with comma in thisItem
>>  replace openQuoteChar with quote in thisItem
>>  replace closeQuoteChar with quote in thisItem
>>  put thisItem into A[lineCounter][itemCounter]
>>   end repeat
>>end repeat
>>return A
>> end CSVtoArray
>> 
>> function getDelimiters pText, nbr
>>-- returns a cr-delimited list of  characters
>>--not found in the variable pText
>>-- use for delimiters for, eg, parsing text files, manipulating arrays, 
>> etc.
>>-- usage: put getDelimiters(pText,2) into tDelims
>>--if tDelims begins with "Error" then exit to top -- or whatever
>>--put line 1 of tDelims into lineDivider
>>--put line 2 of tDelims into itemDivider
>>-- etc.
>>-- by Peter M. Brigham, pmb...@gmail.com — freeware
>>if pText = empty then return "Error: no text specified."
>>if nbr = empty then put 1 into nbr -- default 1 delimiter
>>put "2,3,4,5,6,7,8,16,1

Re: CSV again.

2015-10-17 Thread Alex Tweedly


Hi Peter,

it also requires offsets() - I can guess what it does, but it would be 
safer to get the actual code you use :-)


Thanks
-- Alex.

On 18/10/2015 00:41, Peter M. Brigham wrote:

So here's my attempt. It converts a CVS text to an array. Let's see if there's 
csv data that can break it.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

---

function CSVtoArray pData
-- by Peter M. Brigham, pmb...@gmail.com
-- requires getDelimiters(), howmany()
put getDelimiters(pData,5) into tDelims
put line 1 of tDelims into crChar
put line 2 of tDelims into tabChar
put line 3 of tDelims into commaChar
put line 4 of tDelims into openQuoteChar
put line 5 of tDelims into closeQuoteChar

replace crlf with cr in pData  -- Win to UNIX

replace numtochar(13) with cr in pData -- Mac to UNIX

if howmany(quote,pData) mod 2 = 1 then

   return "This CSV data is not parsable (unclosed quotes in data)."
end if

put offsets(quote,pData) into qOffsets

if qOffsets > 0 then
   put 1 into counter
   repeat for each item q in qOffsets
  if counter mod 2 = 1 then put openQuoteChar into char q of pData
  else put closeQuoteChar into char q of pData
  add 1 to counter
   end repeat
end if

put offsets(cr,pData) into crOffsets

repeat for each item r in crOffsets
   put char 1 to r of pData into upToHere
   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
  -- the cr is within a quoted string
  put crChar into char r of pData
   end if
end repeat
put offsets(tab,pData) into tabOffsets
repeat for each item t in tabOffsets
   put char 1 to t of pData into upToHere
   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
  -- the tab is within a quoted string
  put tabChar into char t of pData
   end if
end repeat
put offsets(comma,pData) into commaOffsets
repeat for each item c in commaOffsets
   put char 1 to c of pData into upToHere
   if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
  -- the comma is within a quoted string
  put commaChar into char c of pData
   end if
end repeat
put 0 into lineCounter
repeat for each line L in pData
   add 1 to lineCounter
   put 0 into itemCounter
   repeat for each item i in L
  add 1 to itemCounter
  put i into thisItem
  if howmany(quote,thisItem) mod 2 = 1 then
 return "This CSV data is not parsable (unclosed quotes in item)."
  end if
  replace crChar with cr in thisItem
  replace tabChar with tab in thisItem
  replace commaChar with comma in thisItem
  replace openQuoteChar with quote in thisItem
  replace closeQuoteChar with quote in thisItem
  put thisItem into A[lineCounter][itemCounter]
   end repeat
end repeat
return A
end CSVtoArray

function getDelimiters pText, nbr
-- returns a cr-delimited list of  characters
--not found in the variable pText
-- use for delimiters for, eg, parsing text files, manipulating arrays, etc.
-- usage: put getDelimiters(pText,2) into tDelims
--if tDelims begins with "Error" then exit to top -- or whatever
--put line 1 of tDelims into lineDivider
--put line 2 of tDelims into itemDivider
-- etc.
-- by Peter M. Brigham, pmb...@gmail.com — freeware

if pText = empty then return "Error: no text specified."

if nbr = empty then put 1 into nbr -- default 1 delimiter
put "2,3,4,5,6,7,8,16,17,18,19,20,21,22,23,24,25,26" into baseList
-- low ASCII values, excluding CR, LF, tab, etc.
put the number of items of baseList into maxNbr
if nbr > maxNbr then return "Error: max" && maxNbr && "delimiters."
repeat with tCount = 1 to nbr
   put true into failed
   repeat with i = 1 to the number of items of baseList
  put item i of baseList into testNbr
  put numtochar(testNbr) into testChar
  if testChar is not in pText then
 -- found one, store and get next delim
 put false into failed
 put testChar into line tCount of delimList
 exit repeat
  end if
   end repeat
   if failed then
  if tCount = 0 then
 return "Error: cannot get any delimiters."
  else if tCount = 1 then
 return "Error: can only get one delimiter."
  else
 return "Error: can only get" && tCount && "delimiters."
  end if
   end if
   delete item i of baseList
end repeat
return delimList
end getDelimiters

function howmany pStr, pContainer, pCaseSens
-- how many times pStr occurs in pContainer
-- note that howmany("00","00") returns 3, not 5

Re: CSV again.

So here's my attempt. It converts a CVS text to an array. Let's see if there's 
csv data that can break it.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

---

function CSVtoArray pData
   -- by Peter M. Brigham, pmb...@gmail.com
   -- requires getDelimiters(), howmany()
   put getDelimiters(pData,5) into tDelims
   put line 1 of tDelims into crChar
   put line 2 of tDelims into tabChar
   put line 3 of tDelims into commaChar
   put line 4 of tDelims into openQuoteChar
   put line 5 of tDelims into closeQuoteChar
   
   replace crlf with cr in pData  -- Win to UNIX
   replace numtochar(13) with cr in pData -- Mac to UNIX
   
   if howmany(quote,pData) mod 2 = 1 then
  return "This CSV data is not parsable (unclosed quotes in data)."
   end if
   
   put offsets(quote,pData) into qOffsets
   if qOffsets > 0 then
  put 1 into counter
  repeat for each item q in qOffsets
 if counter mod 2 = 1 then put openQuoteChar into char q of pData
 else put closeQuoteChar into char q of pData
 add 1 to counter
  end repeat
   end if
   
   put offsets(cr,pData) into crOffsets
   repeat for each item r in crOffsets
  put char 1 to r of pData into upToHere
  if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
 -- the cr is within a quoted string
 put crChar into char r of pData
  end if
   end repeat
   put offsets(tab,pData) into tabOffsets
   repeat for each item t in tabOffsets
  put char 1 to t of pData into upToHere
  if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
 -- the tab is within a quoted string
 put tabChar into char t of pData
  end if
   end repeat
   put offsets(comma,pData) into commaOffsets
   repeat for each item c in commaOffsets
  put char 1 to c of pData into upToHere
  if howmany(openQuoteChar,upToHere) <> howmany(closeQuoteChar,upToHere) 
then
 -- the comma is within a quoted string
 put commaChar into char c of pData
  end if
   end repeat
   put 0 into lineCounter
   repeat for each line L in pData
  add 1 to lineCounter
  put 0 into itemCounter
  repeat for each item i in L
 add 1 to itemCounter
 put i into thisItem
 if howmany(quote,thisItem) mod 2 = 1 then
return "This CSV data is not parsable (unclosed quotes in item)."
 end if
 replace crChar with cr in thisItem
 replace tabChar with tab in thisItem
 replace commaChar with comma in thisItem
 replace openQuoteChar with quote in thisItem
 replace closeQuoteChar with quote in thisItem
 put thisItem into A[lineCounter][itemCounter]
  end repeat
   end repeat
   return A
end CSVtoArray

function getDelimiters pText, nbr
   -- returns a cr-delimited list of  characters
   --not found in the variable pText
   -- use for delimiters for, eg, parsing text files, manipulating arrays, etc.
   -- usage: put getDelimiters(pText,2) into tDelims
   --if tDelims begins with "Error" then exit to top -- or whatever
   --put line 1 of tDelims into lineDivider
   --put line 2 of tDelims into itemDivider
   -- etc.
   -- by Peter M. Brigham, pmb...@gmail.com — freeware
   
   if pText = empty then return "Error: no text specified."
   if nbr = empty then put 1 into nbr -- default 1 delimiter
   put "2,3,4,5,6,7,8,16,17,18,19,20,21,22,23,24,25,26" into baseList
   -- low ASCII values, excluding CR, LF, tab, etc.
   put the number of items of baseList into maxNbr
   if nbr > maxNbr then return "Error: max" && maxNbr && "delimiters."
   repeat with tCount = 1 to nbr
  put true into failed
  repeat with i = 1 to the number of items of baseList
 put item i of baseList into testNbr
 put numtochar(testNbr) into testChar
 if testChar is not in pText then
-- found one, store and get next delim
put false into failed
put testChar into line tCount of delimList
exit repeat
 end if
  end repeat
  if failed then
 if tCount = 0 then
return "Error: cannot get any delimiters."
 else if tCount = 1 then
return "Error: can only get one delimiter."
 else
return "Error: can only get" && tCount && "delimiters."
 end if
  end if
  delete item i of baseList
   end repeat
   return delimList
end getDelimiters

function howmany pStr, pContainer, pCaseSens
   -- how many times pStr occurs in pContainer
   -- note that howmany("00","00") returns 3, not 5
   -- ie,  overlapping matches are not counted
   -- by Peter M. Brigham, pmb...@gmail.com — freeware
   
   if pCaseSens = empty then put false into pCaseSens
   set the casesensitive to pCaseSens
   if pStr is not in pContainer then return 0
   put len(pContainer) into origLength
   replace pStr with ch

Re: CSV again.

I added it to my repository on GitHub if anyone wants to try to do this in
Git.

On Sat, Oct 17, 2015 at 10:53 AM, Mike Kerner 
wrote:

> I am going to put 4 on Git and have at it.
>
> 1) There are other assumptions being made, like assuming that the  and
>  don't appear in the incoming text.  Instead of hardcoding the interim
> substitutions, determine what the interim substitutions are going to be
> (can also allow the user to specify them).  Characters that we need to deal
> with are quote, ,, and comma.
>
> 2) In this version, you can specify the incoming column delimiter.  Add
> the ability for the caller to specify the record delimiter before, the
> column and record delimiters after, and what substitutions are going to be
> used, after.  For example, for embedded 's, perhaps the user wants <13>
> or even a string like a semicolon and a space
>
>
> On Sat, Oct 17, 2015 at 5:03 AM, Alex Tweedly  wrote:
>
>> Naturally it must be removed.
>>
>> But I have a more philosophical issue / question.
>>
>>
>> TSV (in and of itself) doesn't have any quotes, and so doesn't handle
>> quoted CRs or TABs.
>>
>> Currently, the 'old' version - as in Richard's published article, doesn't
>> handle TAB characters enclosed within a quoted cell. The 'new' version does
>> - but only by returning the data delimited by  instead of TAB, and
>> leaving enclosed TABs alone - a mistake, IMHO.
>>
>> I believe that what the converter should do is :
>>  - return TSV - i.e. delimited by TABs
>>  - replace quoted CR by  within quoted cells (as it does now)
>>  - replace quoted TABs by  within quoted cells
>>
>> Any comments or suggestions ?
>>
>> Thanks
>> Alex.
>>
>>
>> On 17/10/2015 02:34, Mike Kerner wrote:
>>
>>> It's safe as long as you remember to remove it at the end of the function
>>>
>>> On Fri, Oct 16, 2015 at 7:12 PM, Alex Tweedly  wrote:
>>>
>>> Duh - replying to myself again :-)

 It looks as though that's exactly what you do mean - it certainly
 generates the problems you described earlier. And my one-line additional
 test would (does in my testing) solve it properly - without it, we don't
 get a chance to flush "theInsideStringSoFar" to tNuData, with the extra
 line we do. And adding it is always safe (AFAICI).

 -- Alex.


 On 17/10/2015 00:03, Alex Tweedly wrote:

 Sorry, Mike, but can you describe what you mean by a "naked" line ?
> Is it simply one with no line delimiter after it ?
> i.e. could only happen on the very last line of a file of input ?
>
> Could that be solved by a simple test (after the various 'replace'
> statements)
>  if the last char of pData <> CR then put CR after pData
> before the parsing happens ?
>
> -- Alex.
>
>
> On 16/10/2015 17:19, Mike Kerner wrote:
>
> No, the problem isn't that LC use LF and CR for ascii(10) and ignores
>> ascii(13).  That's just a personal problem.
>>
>> The problem, here, is that the csv parser handles a naked line and a
>> terminated line differently.  If the line is terminated, it parses it
>> one
>> way, and if it is not, it parses it (incorrectly) a different way,
>> which
>> makes me wonder if this is the latest version.
>>
>> On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar <
>> bobsnei...@iotecdigital.com>
>> wrote:
>>
>> But what if the cr or lf or crlf is inside quoted text, meaning it is
>> not
>>
>>> a delimiter? Oh, I'm afraid the deflector shield will be quite
>>> operational
>>> when your friends arrive.
>>>
>>> Bob S
>>>
>>>
>>> On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:
>>>
 Hi Mike,

 thanks for that additional info.

 I *think* (it's been 3 years) I left them as  (i.e.
 numtochar(29))

 because I had some data including normal TAB characters within the
>>> cells
>>> (!!( and thought  was a safer bet - though of course nothing is
>>> completely safe. It's then up to the caller to decide whether to do
>>> "replace numtochar(29) with TAB in ...", or do TAB escaping, or
>>> whatever
>>> they want.
>>>
>>> As for the other bigger problem  Oh dear = CR vs LF vs CRLF 

 Are you on Mac or Windows or Linux ?
 How is the LF delimited data getting into your app ?
 Maybe we should just add a "replace chartonum(13) with CR in pData"
 ?

 (I confess to being confused by this - I know that LC does

 auto-translation of line delimiters at various places, but I'm not
>>> sure
>>> when it is, or isn't, completely safe. Maybe the easiest thing is to
>>> jst do
>>> all the translations 
>>>
>>>replace CRLF with CR in pData
replace numtochar(10) with CR in pData
replace numtochar(13) with CR in pData

 -- Alex.
>>

Re: CSV again.

I am going to put 4 on Git and have at it.

1) There are other assumptions being made, like assuming that the  and
 don't appear in the incoming text.  Instead of hardcoding the interim
substitutions, determine what the interim substitutions are going to be
(can also allow the user to specify them).  Characters that we need to deal
with are quote, ,, and comma.

2) In this version, you can specify the incoming column delimiter.  Add the
ability for the caller to specify the record delimiter before, the column
and record delimiters after, and what substitutions are going to be used,
after.  For example, for embedded 's, perhaps the user wants <13> or
even a string like a semicolon and a space


On Sat, Oct 17, 2015 at 5:03 AM, Alex Tweedly  wrote:

> Naturally it must be removed.
>
> But I have a more philosophical issue / question.
>
>
> TSV (in and of itself) doesn't have any quotes, and so doesn't handle
> quoted CRs or TABs.
>
> Currently, the 'old' version - as in Richard's published article, doesn't
> handle TAB characters enclosed within a quoted cell. The 'new' version does
> - but only by returning the data delimited by  instead of TAB, and
> leaving enclosed TABs alone - a mistake, IMHO.
>
> I believe that what the converter should do is :
>  - return TSV - i.e. delimited by TABs
>  - replace quoted CR by  within quoted cells (as it does now)
>  - replace quoted TABs by  within quoted cells
>
> Any comments or suggestions ?
>
> Thanks
> Alex.
>
>
> On 17/10/2015 02:34, Mike Kerner wrote:
>
>> It's safe as long as you remember to remove it at the end of the function
>>
>> On Fri, Oct 16, 2015 at 7:12 PM, Alex Tweedly  wrote:
>>
>> Duh - replying to myself again :-)
>>>
>>> It looks as though that's exactly what you do mean - it certainly
>>> generates the problems you described earlier. And my one-line additional
>>> test would (does in my testing) solve it properly - without it, we don't
>>> get a chance to flush "theInsideStringSoFar" to tNuData, with the extra
>>> line we do. And adding it is always safe (AFAICI).
>>>
>>> -- Alex.
>>>
>>>
>>> On 17/10/2015 00:03, Alex Tweedly wrote:
>>>
>>> Sorry, Mike, but can you describe what you mean by a "naked" line ?
 Is it simply one with no line delimiter after it ?
 i.e. could only happen on the very last line of a file of input ?

 Could that be solved by a simple test (after the various 'replace'
 statements)
  if the last char of pData <> CR then put CR after pData
 before the parsing happens ?

 -- Alex.


 On 16/10/2015 17:19, Mike Kerner wrote:

 No, the problem isn't that LC use LF and CR for ascii(10) and ignores
> ascii(13).  That's just a personal problem.
>
> The problem, here, is that the csv parser handles a naked line and a
> terminated line differently.  If the line is terminated, it parses it
> one
> way, and if it is not, it parses it (incorrectly) a different way,
> which
> makes me wonder if this is the latest version.
>
> On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar <
> bobsnei...@iotecdigital.com>
> wrote:
>
> But what if the cr or lf or crlf is inside quoted text, meaning it is
> not
>
>> a delimiter? Oh, I'm afraid the deflector shield will be quite
>> operational
>> when your friends arrive.
>>
>> Bob S
>>
>>
>> On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:
>>
>>> Hi Mike,
>>>
>>> thanks for that additional info.
>>>
>>> I *think* (it's been 3 years) I left them as  (i.e.
>>> numtochar(29))
>>>
>>> because I had some data including normal TAB characters within the
>> cells
>> (!!( and thought  was a safer bet - though of course nothing is
>> completely safe. It's then up to the caller to decide whether to do
>> "replace numtochar(29) with TAB in ...", or do TAB escaping, or
>> whatever
>> they want.
>>
>> As for the other bigger problem  Oh dear = CR vs LF vs CRLF 
>>>
>>> Are you on Mac or Windows or Linux ?
>>> How is the LF delimited data getting into your app ?
>>> Maybe we should just add a "replace chartonum(13) with CR in pData" ?
>>>
>>> (I confess to being confused by this - I know that LC does
>>>
>>> auto-translation of line delimiters at various places, but I'm not
>> sure
>> when it is, or isn't, completely safe. Maybe the easiest thing is to
>> jst do
>> all the translations 
>>
>>replace CRLF with CR in pData
>>>replace numtochar(10) with CR in pData
>>>replace numtochar(13) with CR in pData
>>>
>>> -- Alex.
>>>
>>> ___
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-liv

Re: CSV again.

2015-10-17 Thread Alex Tweedly


Naturally it must be removed.

But I have a more philosophical issue / question.


TSV (in and of itself) doesn't have any quotes, and so doesn't handle 
quoted CRs or TABs.


Currently, the 'old' version - as in Richard's published article, 
doesn't handle TAB characters enclosed within a quoted cell. The 'new' 
version does - but only by returning the data delimited by  instead 
of TAB, and leaving enclosed TABs alone - a mistake, IMHO.


I believe that what the converter should do is :
 - return TSV - i.e. delimited by TABs
 - replace quoted CR by  within quoted cells (as it does now)
 - replace quoted TABs by  within quoted cells

Any comments or suggestions ?

Thanks
Alex.

On 17/10/2015 02:34, Mike Kerner wrote:

It's safe as long as you remember to remove it at the end of the function

On Fri, Oct 16, 2015 at 7:12 PM, Alex Tweedly  wrote:


Duh - replying to myself again :-)

It looks as though that's exactly what you do mean - it certainly
generates the problems you described earlier. And my one-line additional
test would (does in my testing) solve it properly - without it, we don't
get a chance to flush "theInsideStringSoFar" to tNuData, with the extra
line we do. And adding it is always safe (AFAICI).

-- Alex.


On 17/10/2015 00:03, Alex Tweedly wrote:


Sorry, Mike, but can you describe what you mean by a "naked" line ?
Is it simply one with no line delimiter after it ?
i.e. could only happen on the very last line of a file of input ?

Could that be solved by a simple test (after the various 'replace'
statements)
 if the last char of pData <> CR then put CR after pData
before the parsing happens ?

-- Alex.


On 16/10/2015 17:19, Mike Kerner wrote:


No, the problem isn't that LC use LF and CR for ascii(10) and ignores
ascii(13).  That's just a personal problem.

The problem, here, is that the csv parser handles a naked line and a
terminated line differently.  If the line is terminated, it parses it one
way, and if it is not, it parses it (incorrectly) a different way, which
makes me wonder if this is the latest version.

On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar <
bobsnei...@iotecdigital.com>
wrote:

But what if the cr or lf or crlf is inside quoted text, meaning it is not

a delimiter? Oh, I'm afraid the deflector shield will be quite
operational
when your friends arrive.

Bob S


On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:

Hi Mike,

thanks for that additional info.

I *think* (it's been 3 years) I left them as  (i.e. numtochar(29))


because I had some data including normal TAB characters within the cells
(!!( and thought  was a safer bet - though of course nothing is
completely safe. It's then up to the caller to decide whether to do
"replace numtochar(29) with TAB in ...", or do TAB escaping, or whatever
they want.


As for the other bigger problem  Oh dear = CR vs LF vs CRLF 

Are you on Mac or Windows or Linux ?
How is the LF delimited data getting into your app ?
Maybe we should just add a "replace chartonum(13) with CR in pData" ?

(I confess to being confused by this - I know that LC does


auto-translation of line delimiters at various places, but I'm not sure
when it is, or isn't, completely safe. Maybe the easiest thing is to
jst do
all the translations 


   replace CRLF with CR in pData
   replace numtochar(10) with CR in pData
   replace numtochar(13) with CR in pData

-- Alex.


___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode





___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode



___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode







___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-16 Thread Mike Kerner

It's safe as long as you remember to remove it at the end of the function

On Fri, Oct 16, 2015 at 7:12 PM, Alex Tweedly  wrote:

> Duh - replying to myself again :-)
>
> It looks as though that's exactly what you do mean - it certainly
> generates the problems you described earlier. And my one-line additional
> test would (does in my testing) solve it properly - without it, we don't
> get a chance to flush "theInsideStringSoFar" to tNuData, with the extra
> line we do. And adding it is always safe (AFAICI).
>
> -- Alex.
>
>
> On 17/10/2015 00:03, Alex Tweedly wrote:
>
>> Sorry, Mike, but can you describe what you mean by a "naked" line ?
>> Is it simply one with no line delimiter after it ?
>> i.e. could only happen on the very last line of a file of input ?
>>
>> Could that be solved by a simple test (after the various 'replace'
>> statements)
>> if the last char of pData <> CR then put CR after pData
>> before the parsing happens ?
>>
>> -- Alex.
>>
>>
>> On 16/10/2015 17:19, Mike Kerner wrote:
>>
>>> No, the problem isn't that LC use LF and CR for ascii(10) and ignores
>>> ascii(13).  That's just a personal problem.
>>>
>>> The problem, here, is that the csv parser handles a naked line and a
>>> terminated line differently.  If the line is terminated, it parses it one
>>> way, and if it is not, it parses it (incorrectly) a different way, which
>>> makes me wonder if this is the latest version.
>>>
>>> On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar <
>>> bobsnei...@iotecdigital.com>
>>> wrote:
>>>
>>> But what if the cr or lf or crlf is inside quoted text, meaning it is not
 a delimiter? Oh, I'm afraid the deflector shield will be quite
 operational
 when your friends arrive.

 Bob S


 On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:
>
> Hi Mike,
>
> thanks for that additional info.
>
> I *think* (it's been 3 years) I left them as  (i.e. numtochar(29))
>
 because I had some data including normal TAB characters within the cells
 (!!( and thought  was a safer bet - though of course nothing is
 completely safe. It's then up to the caller to decide whether to do
 "replace numtochar(29) with TAB in ...", or do TAB escaping, or whatever
 they want.

> As for the other bigger problem  Oh dear = CR vs LF vs CRLF 
>
> Are you on Mac or Windows or Linux ?
> How is the LF delimited data getting into your app ?
> Maybe we should just add a "replace chartonum(13) with CR in pData" ?
>
> (I confess to being confused by this - I know that LC does
>
 auto-translation of line delimiters at various places, but I'm not sure
 when it is, or isn't, completely safe. Maybe the easiest thing is to
 jst do
 all the translations 

>   replace CRLF with CR in pData
>   replace numtochar(10) with CR in pData
>   replace numtochar(13) with CR in pData
>
> -- Alex.
>

 ___
 use-livecode mailing list
 use-livecode@lists.runrev.com
 Please visit this url to subscribe, unsubscribe and manage your
 subscription preferences:
 http://lists.runrev.com/mailman/listinfo/use-livecode


>>>
>>>
>>
>> ___
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>
>
>
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.


Duh - replying to myself again :-)

It looks as though that's exactly what you do mean - it certainly 
generates the problems you described earlier. And my one-line additional 
test would (does in my testing) solve it properly - without it, we don't 
get a chance to flush "theInsideStringSoFar" to tNuData, with the extra 
line we do. And adding it is always safe (AFAICI).


-- Alex.

On 17/10/2015 00:03, Alex Tweedly wrote:

Sorry, Mike, but can you describe what you mean by a "naked" line ?
Is it simply one with no line delimiter after it ?
i.e. could only happen on the very last line of a file of input ?

Could that be solved by a simple test (after the various 'replace' 
statements)

if the last char of pData <> CR then put CR after pData
before the parsing happens ?

-- Alex.


On 16/10/2015 17:19, Mike Kerner wrote:

No, the problem isn't that LC use LF and CR for ascii(10) and ignores
ascii(13).  That's just a personal problem.

The problem, here, is that the csv parser handles a naked line and a
terminated line differently.  If the line is terminated, it parses it 
one

way, and if it is not, it parses it (incorrectly) a different way, which
makes me wonder if this is the latest version.

On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar 


wrote:

But what if the cr or lf or crlf is inside quoted text, meaning it 
is not
a delimiter? Oh, I'm afraid the deflector shield will be quite 
operational

when your friends arrive.

Bob S



On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:

Hi Mike,

thanks for that additional info.

I *think* (it's been 3 years) I left them as  (i.e. numtochar(29))
because I had some data including normal TAB characters within the 
cells

(!!( and thought  was a safer bet - though of course nothing is
completely safe. It's then up to the caller to decide whether to do
"replace numtochar(29) with TAB in ...", or do TAB escaping, or 
whatever

they want.

As for the other bigger problem  Oh dear = CR vs LF vs CRLF 

Are you on Mac or Windows or Linux ?
How is the LF delimited data getting into your app ?
Maybe we should just add a "replace chartonum(13) with CR in pData" ?

(I confess to being confused by this - I know that LC does

auto-translation of line delimiters at various places, but I'm not sure
when it is, or isn't, completely safe. Maybe the easiest thing is to 
jst do

all the translations 

  replace CRLF with CR in pData
  replace numtochar(10) with CR in pData
  replace numtochar(13) with CR in pData

-- Alex.


___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode







___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your 
subscription preferences:

http://lists.runrev.com/mailman/listinfo/use-livecode



___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.


Sorry, Mike, but can you describe what you mean by a "naked" line ?
Is it simply one with no line delimiter after it ?
i.e. could only happen on the very last line of a file of input ?

Could that be solved by a simple test (after the various 'replace' 
statements)

if the last char of pData <> CR then put CR after pData
before the parsing happens ?

-- Alex.


On 16/10/2015 17:19, Mike Kerner wrote:

No, the problem isn't that LC use LF and CR for ascii(10) and ignores
ascii(13).  That's just a personal problem.

The problem, here, is that the csv parser handles a naked line and a
terminated line differently.  If the line is terminated, it parses it one
way, and if it is not, it parses it (incorrectly) a different way, which
makes me wonder if this is the latest version.

On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar 
wrote:


But what if the cr or lf or crlf is inside quoted text, meaning it is not
a delimiter? Oh, I'm afraid the deflector shield will be quite operational
when your friends arrive.

Bob S



On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:

Hi Mike,

thanks for that additional info.

I *think* (it's been 3 years) I left them as  (i.e. numtochar(29))

because I had some data including normal TAB characters within the cells
(!!( and thought  was a safer bet - though of course nothing is
completely safe. It's then up to the caller to decide whether to do
"replace numtochar(29) with TAB in ...", or do TAB escaping, or whatever
they want.

As for the other bigger problem    Oh dear = CR vs LF vs CRLF 

Are you on Mac or Windows or Linux ?
How is the LF delimited data getting into your app ?
Maybe we should just add a "replace chartonum(13) with CR in pData" ?

(I confess to being confused by this - I know that LC does

auto-translation of line delimiters at various places, but I'm not sure
when it is, or isn't, completely safe. Maybe the easiest thing is to jst do
all the translations 

  replace CRLF with CR in pData
  replace numtochar(10) with CR in pData
  replace numtochar(13) with CR in pData

-- Alex.


___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode







___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

It's likely (but of course not 100% guaranteed) that those characters
have themselves been manipulated in a consistent way by either LC or any
other subsystem - i.e. auto-translated or not.

Anyone who chooses to use those as genuinely different characters within
quoted cells *deserves* to have their data be unreadable :-)

-- Alex.

On 16/10/2015 16:28, Bob Sneidar wrote:

But what if the cr or lf or crlf is inside quoted text, meaning it is not a
delimiter? Oh, I'm afraid the deflector shield will be quite operational when
your friends arrive.

Bob S

On Oct 16, 2015, at 08:04 , Alex Tweedly wrote:

Hi Mike,

thanks for that additional info.

I *think* (it's been 3 years) I left them as (i.e. numtochar(29)) because I had some data
including normal TAB characters within the cells (!!( and thought was a safer bet - though
of course nothing is completely safe. It's then up to the caller to decide whether to do
"replace numtochar(29) with TAB in ...", or do TAB escaping, or whatever they want.

As for the other bigger problem Oh dear = CR vs LF vs CRLF

Are you on Mac or Windows or Linux ?
How is the LF delimited data getting into your app ?
Maybe we should just add a "replace chartonum(13) with CR in pData" ?

(I confess to being confused by this - I know that LC does auto-translation of
line delimiters at various places, but I'm not sure when it is, or isn't,
completely safe. Maybe the easiest thing is to jst do all the translations

replace CRLF with CR in pData
replace numtochar(10) with CR in pData
replace numtochar(13) with CR in pData

-- Alex.

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-16 Thread Bob Sneidar

The force is strong with this one.

Bob S


On Oct 16, 2015, at 09:19 , Mike Kerner 
mailto:mikeker...@roadrunner.com>> wrote:

No, the problem isn't that LC use LF and CR for ascii(10) and ignores
ascii(13).  That's just a personal problem.

The problem, here, is that the csv parser handles a naked line and a
terminated line differently.  If the line is terminated, it parses it one
way, and if it is not, it parses it (incorrectly) a different way, which
makes me wonder if this is the latest version.

On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar 
mailto:bobsnei...@iotecdigital.com>>
wrote:

But what if the cr or lf or crlf is inside quoted text, meaning it is not
a delimiter? Oh, I'm afraid the deflector shield will be quite operational
when your friends arrive.

Bob S

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-16 Thread Mike Kerner

No, the problem isn't that LC use LF and CR for ascii(10) and ignores
ascii(13).  That's just a personal problem.

The problem, here, is that the csv parser handles a naked line and a
terminated line differently.  If the line is terminated, it parses it one
way, and if it is not, it parses it (incorrectly) a different way, which
makes me wonder if this is the latest version.

On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar 
wrote:

> But what if the cr or lf or crlf is inside quoted text, meaning it is not
> a delimiter? Oh, I'm afraid the deflector shield will be quite operational
> when your friends arrive.
>
> Bob S
>
>
> > On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:
> >
> > Hi Mike,
> >
> > thanks for that additional info.
> >
> > I *think* (it's been 3 years) I left them as  (i.e. numtochar(29))
> because I had some data including normal TAB characters within the cells
> (!!( and thought  was a safer bet - though of course nothing is
> completely safe. It's then up to the caller to decide whether to do
> "replace numtochar(29) with TAB in ...", or do TAB escaping, or whatever
> they want.
> >
> > As for the other bigger problem    Oh dear = CR vs LF vs CRLF 
> >
> > Are you on Mac or Windows or Linux ?
> > How is the LF delimited data getting into your app ?
> > Maybe we should just add a "replace chartonum(13) with CR in pData" ?
> >
> > (I confess to being confused by this - I know that LC does
> auto-translation of line delimiters at various places, but I'm not sure
> when it is, or isn't, completely safe. Maybe the easiest thing is to jst do
> all the translations 
> >
> >  replace CRLF with CR in pData
> >  replace numtochar(10) with CR in pData
> >  replace numtochar(13) with CR in pData
> >
> > -- Alex.
>
>
> ___
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.

2015-10-16 Thread Bob Sneidar

But what if the cr or lf or crlf is inside quoted text, meaning it is not a 
delimiter? Oh, I'm afraid the deflector shield will be quite operational when 
your friends arrive.

Bob S


> On Oct 16, 2015, at 08:04 , Alex Tweedly  wrote:
> 
> Hi Mike,
> 
> thanks for that additional info.
> 
> I *think* (it's been 3 years) I left them as  (i.e. numtochar(29)) 
> because I had some data including normal TAB characters within the cells (!!( 
> and thought  was a safer bet - though of course nothing is completely 
> safe. It's then up to the caller to decide whether to do "replace 
> numtochar(29) with TAB in ...", or do TAB escaping, or whatever they want.
> 
> As for the other bigger problem    Oh dear = CR vs LF vs CRLF 
> 
> Are you on Mac or Windows or Linux ?
> How is the LF delimited data getting into your app ?
> Maybe we should just add a "replace chartonum(13) with CR in pData" ?
> 
> (I confess to being confused by this - I know that LC does auto-translation 
> of line delimiters at various places, but I'm not sure when it is, or isn't, 
> completely safe. Maybe the easiest thing is to jst do all the translations 
> 
> 
>  replace CRLF with CR in pData
>  replace numtochar(10) with CR in pData
>  replace numtochar(13) with CR in pData
> 
> -- Alex.


___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: CSV again.