Re: Using Document to text ala Receive Packet Was: Receive packet stop character

Chip Scheide via 4D_Tech Thu, 23 Apr 2020 21:15:39 -0700

John,
unless you are reading data from a stream - 
why don't you just read the entire document into memory, THEN parse it?


every reference to the document (on disk) will take longer - by a factor of 1000 
or more - than referencing the same information from memory. (spinning disks 
access in milliseconds, memory in nanoseconds, SSDs are in between).

so read the entire document -
examine the text until you find a CR, LF or combination - 
if you find one check the next and/or previous character for the other.
now you know your EOL. 
You cannot depend on the platform to determine the EOL.
A file created on a Mac can be read on a PC and vice versa.
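
The detection described above could look roughly like the following. This is a sketch, not tested against real files: Detect_EOL is a made-up method name, it looks only at the first line ending it finds, and it assumes the whole document uses that same ending. It does not handle the rare LF+CR order.

```4d
  // Method: Detect_EOL (hypothetical name)
  // $1 : text to examine
  // $0 : the EOL found: CR, LF or CR+LF, or "" when the text has no line ending
C_TEXT($1;$0)
C_LONGINT($posCR;$posLF)

$posCR:=Position(Char(13);$1)  // first CR, 0 if none
$posLF:=Position(Char(10);$1)  // first LF, 0 if none

Case of 
	: (($posCR=0) & ($posLF=0))
		$0:=""  // single line, no EOL at all
	: ($posLF=0)
		$0:=Char(13)  // CR only
	: ($posCR=0)
		$0:=Char(10)  // LF only
	: ($posLF=($posCR+1))
		$0:=Char(13)+Char(10)  // CR immediately followed by LF
	: ($posCR<$posLF)
		$0:=Char(13)  // mixed file, CR comes first
	Else 
		$0:=Char(10)  // mixed file, LF comes first
End case 
```

Called once after the document is loaded, for example $eol:=Detect_EOL (TextVar), the result can then be used as the delimiter for every line.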

if you need to process 1 'line' at a time just step through the text ending on 
the EOL from above.
you can check the size of the text, or the counter/pointer into your text 
(depending on your processing method) to determine if you have processed 
everything.

something like this (assumes the document is already in the text var TextVar):
Last_Char_Counter:=0
Repeat 
        local_text:=get_text_to_next_EOL (TextVar;->Last_Char_Counter)
        If (local_text#"")
                  // process the line in local_text here
        End if 
Until ((Length(local_text)<5) | ((Last_Char_Counter+5)>=Length(TextVar)))

5 is to account for various extra characters I have often found at the end of 
documents, like multiple CR/CRLF etc. YMMV on this number or its need.
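
One possible body for get_text_to_next_EOL, as a sketch only. The method name and its two parameters come from the loop above; everything inside is an assumption. Here the counter holds the position of the last character already consumed, and the method accepts CR, LF or CR+LF as a line ending so it does not need to be told the EOL.

```4d
  // Method: get_text_to_next_EOL (body is an assumed sketch)
  // $1 : text holding the whole document
  // $2 : pointer to a longint, position of the last character consumed so far
  // $0 : the next line, without its line ending
C_TEXT($1;$0)
C_POINTER($2)
C_LONGINT($start;$posCR;$posLF;$eol;$len)

$0:=""
$len:=Length($1)
$start:=($2->)+1
If ($start<=$len)
	$posCR:=Position(Char(13);$1;$start)  // next CR at or after $start, 0 if none
	$posLF:=Position(Char(10);$1;$start)  // next LF at or after $start, 0 if none
	Case of 
		: ($posCR=0)
			$eol:=$posLF  // also covers the case where both are 0
		: ($posLF=0)
			$eol:=$posCR
		: ($posCR<$posLF)
			$eol:=$posCR
		Else 
			$eol:=$posLF
	End case 
	If ($eol=0)
		$0:=Substring($1;$start)  // last line, no line ending after it
		$2->:=$len
	Else 
		If ($eol>$start)
			$0:=Substring($1;$start;$eol-$start)
		End if 
		$2->:=$eol
		If (($eol=$posCR) & ($posLF=($posCR+1)))
			$2->:=$posLF  // CR+LF pair: consume the LF too
		End if 
	End if 
End if 
```

Searching for both characters on every call is wasteful when one of them never occurs in the file. Once the EOL is known, searching only for that sequence would be cheaper.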

Chip


>> On Mar 25, 2020, at 8:21 PM, Keisuke Miyako via 4D_Tech 
>> <4d_tech@lists.4d.com> wrote:
>> 
>> sometimes, it might just be easier to read the entire text with
>> 
>> Document to text (which can normalise the EOL character)
> 
> I took Keisuke’s advice to heart and looked closer at Document to 
> text. I did not realize that it provided a means to handle any EOL 
> confusion between Mac and Windows. For me this makes the use of 
> Document to text a much better way to read the contents of a document 
> by row than Receive packet. 
> 
>  In an effort to duplicate Receive packet’s functionality when used 
> in a repeat loop to get rows of a tab delimited text document, I 
> created a wrapper method as a replacement for Receive packet.
> 
> I am posting this method here in the event someone on the NUG 
> might find it useful. More importantly for me, however, perhaps to 
> get some feedback with regard to anything I may have overlooked or 
> anything I may need to do to clean it up.
> 
> It was written in a v14 database so does not use any of the newer 
> features in v17 and later like Is Windows and/or objects.
> 
>  Thanks for any feedback.
> 
> John
> 
> 
> ------------------------------
> 
>   // Method: ReceivePacket_DocToText (->pathToDocumentVariable;->VariableToHoldDocContents;BreakMode)
>   // ----------------------------------------------------
>   // Created by: John Baughman
>   // ----------------------------------------------------
>   // Description
>   //Replacement for Receive Packet using Document To Text
> 
>   // Parameters
> C_POINTER($1;$pathPtr)  //$1 is a pointer to the variable holding the path to the document.
>   //In a repeat loop you can put an empty string in the variable to allow the user to pick the document using Select document.
>   //The variable will be updated to contain the chosen path for subsequent calls for rows in the loop.
>   //If the user cancels the Select document dialog, $rowText:="", and $documentTextPtr->:=""
> 
> C_POINTER($2;$documentTextPtr)  //Pass an empty text variable in the first call and it will be loaded from the document with Document to text.
>   //The variable will be loaded with the document contents minus the row being returned.
> 
> C_TEXT($0;$rowText)  //$rowText will hold the first row in the $documentTextPtr variable and is returned in $0.
>   //The first row in the $documentTextPtr variable will be removed.
> 
> C_LONGINT($3;$breakMode)  //Document to text will convert the EOL character according to one of the following break mode constants...
>   //Document with CR
>   //Document with CRLF
>   //Document with LF
>   //Document with native format
> 
>   //Example Call
> If (False)
>   C_TEXT($TextValue;$path)
>   $TextValue:=""
>   $path:=""
>   Repeat 
>     $rowText:=ReceivePacket_DocToText (->$path;->$TextValue;Document with CR)
>        //If $TextValue is "" then the text will be loaded from the document and returned minus the first row in $TextValue
>        //Subsequent calls will return the first row in $rowText and return the text minus the first row in $TextValue.
>       If ($rowText#"")
>          //handle the row
>       End if
>   Until ($TextValue="")   //note: I do not use the ok variable as I found that the ok variable may get set incorrectly by 4D if the last row does not have CR.
> 
> End if 
> 
>   // ----------------------------------------------------
> 
> $pathPtr:=$1
> $DocumentTextPtr:=$2
> $breakMode:=$3
> $rowText:=""
> 
> If ($pathPtr->="")
>         //They want to select the document
>       ARRAY TEXT($aSelected;0)
>       $path:=Select document("";"*";"";0;$aSelected)
>       
>       If (ok=1)
>               $pathPtr->:=$aSelected{1}
>               
>       End if 
>       
> End if 
> 
> If (ok=1)
>       
>       If ($DocumentTextPtr->="")
>                 //Document has not yet been loaded
>               If (Is Windows)
>                       $documentTextPtr->:=Document to text($pathPtr->;"ANSI_X3.4-1986";$breakMode)
>                       $stopCharacter:="\r\n"
>                       
>               Else 
>                       $documentTextPtr->:=Document to text($pathPtr->;"MacRoman";$breakMode)
>                       $stopCharacter:="\r"
>                       
>               End if 
> 
>          //else document has been loaded use text in DocumentTextPtr
>               
>       End if 
>       
>       $found:=False
>       
>       Case of 
>                       
>               : ($breakMode=Document with CR)
>                       $stopCharacter:="\r"
>                       
>               : ($breakMode=Document with LF)
>                       $stopCharacter:="\n"
>                       
>               : ($breakMode=Document with CRLF)
>                       $stopCharacter:="\r\n"
>                       
>                         //else
>                         //$stopCharacter was set to the default for the platform above, following Document to text
>                       
>       End case 
>       
>       
>       If (Position($stopCharacter;$DocumentTextPtr->)=0)
>                 //Assumes that this is the last row without an EOL character. Otherwise it is an error
>               $rowText:=$documentTextPtr->  //return what is left
>               ok:=0  //we are done. Receive packet sets ok to 0 when done in a repeat loop so I am doing the same here. I have found the ok variable to be unreliable in this method
>               $documentTextPtr->:=""    //nothing left
>               
>       Else 
>               
>               $rowText:=Substring($documentTextPtr->;1;Position($stopCharacter;$documentTextPtr->)-1)  //put the first row without the stop character in $rowText
>               $documentTextPtr->:=Replace string($documentTextPtr->;$rowText+$stopCharacter;"";1)   //return the document text minus the first row in documentTextPtr; the final 1 limits the replacement to the first occurrence so identical later rows are kept
>               
>       End if 
>       
> Else 
>       $DocumentTextPtr->:=""
>       
> End if 
> 
> $0:=$rowText  //return the first row
> **********************************************************************
> 4D Internet Users Group (4D iNUG)
> Archive:  http://lists.4d.com/archives.html
> Options: https://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **********************************************************************
------------
Hell is other people 
     Jean-Paul Sartre