John,
unless you are reading data from a stream, why not read the entire document
into memory first, THEN parse it?
Every reference to the document on disk takes far longer, often by several
orders of magnitude, than referencing the same information in memory (spinning
disks respond in milliseconds, memory in nanoseconds, SSDs are in between).
So read the entire document, then examine the text until you find a CR or an LF.
When you find one, check the next and/or previous character for the other.
Now you know your EOL.
You cannot depend on the platform to determine the EOL: a file created on a
Mac can be read on a PC and vice versa.
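
As a rough sketch of the check described above: DetectEOL is a hypothetical project method name (it is not part of John's code), and it assumes the whole document is already in the text passed as $1. It looks for the first CR and the first LF and reports whichever line ending comes first.

```4d
  // DetectEOL (hypothetical name): returns the first line ending found in $1,
  // or "" when the text contains no CR or LF at all.
C_TEXT($0;$1;$cr;$lf)
C_LONGINT($posCR;$posLF)
$cr:=Char(Carriage return)
$lf:=Char(Line feed)
$posCR:=Position($cr;$1;*)  // * = compare by character code
$posLF:=Position($lf;$1;*)
Case of 
  : (($posCR=0) & ($posLF=0))
    $0:=""
  : ($posLF=0)
    $0:=$cr  // CR only
  : ($posCR=0)
    $0:=$lf  // LF only
  : ($posLF=($posCR+1))
    $0:=$cr+$lf  // CR immediately followed by LF
  : ($posCR<$posLF)
    $0:=$cr
  Else 
    $0:=$lf
End case 
```

Only the first line ending is examined, so a document with mixed line endings would need extra handling.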
If you need to process one 'line' at a time, just step through the text,
stopping at each occurrence of the EOL found above.
You can check the size of the text, or the counter/pointer into your text
(depending on your processing method), to determine whether you have processed
everything.
Something like this (assumes the document is in the text variable TextVar):
Last_Char_Counter:=0
Repeat
  local_text:=get_text_to_next_EOL (TextVar;->Last_Char_Counter)
  If (local_text#"")
    // process local_text
  End if
Until ((Length(local_text)<5) | (Last_Char_Counter+5>=Length(TextVar)))
The 5 accounts for the various extra characters I have often found at the end
of documents, like multiple CR/CRLF etc. YMMV on this number or its need.
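
get_text_to_next_EOL in the loop above is only a placeholder name. One possible version is sketched here under stated assumptions: the third parameter carrying the EOL string is an addition for illustration (the loop above passes only two parameters), and the pointed-to counter is taken to hold the index of the last character already consumed.

```4d
  // get_text_to_next_EOL (text; ->lastCharCounter; eol) -> next line
  // Hypothetical helper. $3 (the EOL string) is an assumed extra parameter.
C_TEXT($0;$1;$3)
C_POINTER($2)
C_LONGINT($start;$pos)
$start:=$2->+1  // first character not yet consumed
$pos:=Position($3;$1;$start;*)  // next EOL at or after $start
Case of 
  : ($pos=0)
    $0:=Substring($1;$start)  // no EOL left: return the remainder
    $2->:=Length($1)
  : ($pos=$start)
    $0:=""  // empty line
    $2->:=$pos+Length($3)-1
  Else 
    $0:=Substring($1;$start;$pos-$start)
    $2->:=$pos+Length($3)-1  // counter now sits on the last EOL character
End case 
```

With this version the call in the loop would become something like local_text:=get_text_to_next_EOL (TextVar;->Last_Char_Counter;$eol), where $eol is whatever EOL was detected earlier.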
Chip
>> On Mar 25, 2020, at 8:21 PM, Keisuke Miyako via 4D_Tech
>> <[email protected]> wrote:
>>
>> sometimes, it might just be easier to read the entire text with
>>
>> Document to text (which can normalise the EOL character)
>
> I took Keisuke’s advice to heart and looked closer at Document to
> text. I did not realize that it provided a means to handle any EOL
> confusion between Mac and Windows. For me this makes the use of
> Document to text a much better way to read the contents of a document
> by row than Receive packet.
>
> In an effort to duplicate Receive packet’s functionality when used
> in a repeat loop to get rows of a tab delimited text document, I
> created a wrapper method as a replacement for Receive packet.
>
> I am posting this method here in the event someone on the NUG
> might find it useful. More importantly for me, however, I hope to
> get some feedback with regard to anything I may have overlooked or
> anything I may need to do to clean it up.
>
> It was written in a v14 database so does not use any of the newer
> features in v17 and later like Is Windows and/or objects.
>
> Thanks for any feedback.
>
> John
>
>
> ------------------------------
>
> // Method: ReceivePacket_DocToText
> // (->pathToDocumentVariable;->VariableToHoldDocContents;BreakMode)
> // ----------------------------------------------------
> // Created by: John Baughman
> // ----------------------------------------------------
> // Description
> //Replacement for Receive Packet using Document To Text
>
> // Parameters
> C_POINTER($1;$pathPtr) //$1 is a pointer to the variable holding the
> //path to the document.
> //In a repeat loop you can put an empty string in the variable to
> //allow the user to pick the document using Select document.
> //The variable will be updated to contain the chosen path for
> //subsequent calls for rows in the loop.
> //If the user cancels the Select document, $rowText:="" and
> //$documentTextPtr->:=""
>
> C_POINTER($2;$documentTextPtr) //Pass an empty text variable in the
> //first call and it will be loaded from the document with Document to
> //text.
> //The variable will be loaded with the document contents minus the
> //row being returned.
>
> C_TEXT($0;$rowText) //$rowText will hold the first row in the
> //$documentTextPtr variable and is returned in $0.
> //The first row in the $documentTextPtr variable will be removed.
>
> C_LONGINT($3;$breakMode) //Document to text will convert the EOL
> //character according to one of the following break mode constants...
> //Document with CR
> //Document with CRLF
> //Document with LF
> //Document with native format
>
> //Example Call
> If (False)
> C_TEXT($TextValue;$path)
> $TextValue:=""
> $path:=""
> Repeat
> $rowText:=ReceivePacket_DocToText (->$path;->$TextValue;Document with CR)
> //If $TextValue is "" then the text will be loaded from the
> //document and returned minus the first row in $TextValue.
> //Subsequent calls will return the first row in $rowText and return the
> //text minus the first row in $TextValue.
> If ($rowText#"")
> //handle the row
> End if
> Until ($TextValue="") //note: I do not use the ok variable as I
> //found that the ok variable may get set incorrectly by 4D if the last
> //row does not have CR.
>
> End if
>
> // ----------------------------------------------------
>
> $pathPtr:=$1
> $DocumentTextPtr:=$2
> $breakMode:=$3
> $rowText:=""
>
> If ($pathPtr->="")
> //They want to select the document
> ARRAY TEXT($aSelected;0)
> $path:=Select document("";"*";"";0;$aSelected)
>
> If (ok=1)
> $pathPtr->:=$aSelected{1}
>
> End if
>
> End if
>
> If (ok=1)
>
> If ($documentTextPtr->="")
> //Document has not yet been loaded
> If (Is Windows)
> $documentTextPtr->:=Document to text($pathPtr->;"ANSI_X3.4-1986";$breakMode)
>
> Else
> $documentTextPtr->:=Document to text($pathPtr->;"MacRoman";$breakMode)
>
> End if
>
> //else document has been loaded use text in documentTextPtr
>
> End if
>
> $found:=False
>
> Case of
>
> : ($breakMode=Document with CR)
> $stopCharacter:="\r"
>
> : ($breakMode=Document with LF)
> $stopCharacter:="\n"
>
> : ($breakMode=Document with CRLF)
> $stopCharacter:="\r\n"
>
> Else
> //Document with native format: use the default for the platform
> If (Is Windows)
> $stopCharacter:="\r\n"
> Else
> $stopCharacter:="\r"
> End if
>
> End case
>
>
> $pos:=Position($stopCharacter;$documentTextPtr->)
> If ($pos=0)
> //Assumes that this is the last row without an EOL character.
> //Otherwise it is an error
> $rowText:=$documentTextPtr-> //return what is left
> ok:=0 //we are done. Receive packet sets ok to 0 when done in a
> //repeat loop so I am doing the same here. I have found the ok variable
> //to be unreliable in this method
> $documentTextPtr->:="" //nothing left
>
> Else
> //put the first row without the stop character in $rowText
> $rowText:=Substring($documentTextPtr->;1;$pos-1)
> //return the document text minus the first row in documentTextPtr.
> //Substring removes only this first row, even if the same text
> //occurs again later in the document.
> $documentTextPtr->:=Substring($documentTextPtr->;$pos+Length($stopCharacter))
>
> End if
>
> Else
> $documentTextPtr->:=""
>
> End if
>
> $0:=$rowText //return the first row
> **********************************************************************
> 4D Internet Users Group (4D iNUG)
> Archive: http://lists.4d.com/archives.html
> Options: https://lists.4d.com/mailman/options/4d_tech
> Unsub: mailto:[email protected]
> **********************************************************************
------------
Hell is other people
Jean-Paul Sartre