> On 17 Nov 2016, at 16:07, Chip Scheide <[email protected]> wrote:
>
>
> Thanks.
>
> The problem I am having is not going from disk to memory.
> I read the disk file into a text array, limiting each array element to
> 1.5 GB (not that I have really had something that big; it is a holdover
> from pre-v11, where 32K characters was the text var/field limit).
I understand that you keep the whole document in an array, right?
If so: splitting a huge text into a text array makes it much easier to
manipulate, but in the end the 4D memory used is the same. For my 6.6 GB
file it was not a solution.
That said, I read your first message too fast (as usual). It seems your
document is not so huge ("billion" means something else in French, I always
mix them up).
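For the record, splitting a long text into fixed-size array elements can be
sketched like this (the $chunk_l limit is hypothetical; the For increment
does the slicing):
***
C_TEXT($data_t)
C_LONGINT($chunk_l;$i)
ARRAY TEXT($chunks_at;0)
$chunk_l:=32000  //the pre-v11 text limit quoted above
For ($i;1;Length($data_t);$chunk_l)
	APPEND TO ARRAY($chunks_at;Substring($data_t;$i;$chunk_l))
End for
***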
For smaller documents, I used in v12 a wrapper for 'document to text':
***
//FS_documentToText (path_t{;charSet_t{;lineEnd_l}}) -> txt
C_TEXT($0;$1;$2;$doc_t;$charSet_t)
C_BLOB($data_x)
$doc_t:=$1
$charSet_t:="utf-8"  //default charset
If (Count parameters>1)
	$charSet_t:=$2
End if
DOCUMENT TO BLOB($doc_t;$data_x)
$0:=Convert to text($data_x;$charSet_t)
***
Another thing I read too fast is your use of Position/Substring plus
truncating the source. Since v11 this can be avoided with the two extra
Position parameters (they changed my life):
- startFrom: the position to start searching from
- * (passed at the end)
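As a minimal sketch (assuming a text variable $data_t holding the source),
the startFrom parameter lets a loop walk through the text instead of
truncating it after each match:
***
C_TEXT($data_t)
C_LONGINT($pos_l;$start_l)
$data_t:="a;b;c"
$start_l:=1
Repeat
	$pos_l:=Position(";";$data_t;$start_l;*)
	If ($pos_l>0)
		//the token is Substring($data_t;$start_l;$pos_l-$start_l)
		$start_l:=$pos_l+1  //resume search after the match
	End if
Until ($pos_l=0)
***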
The classical "delimited text to array" becomes quite simple:
<http://forums.4d.fr/Post/FR/11429115/1/11429116#11429116>
And reading a csv file too:
***
$data_t:=FS_documentToText ($path_t;"utf-8")
ARRAY TEXT($line_at;0)
explode (->$line_at;$data_t;"\r")  //explode: utility method, splits text on a delimiter
ARRAY TEXT($field_at;0)
For ($i;1;Size of array($line_at))
	explode (->$field_at;$line_at{$i};",")
	//process $field_at here, it is reused on the next line
End for
***
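Note that explode is not a built-in 4D command; a minimal sketch of such a
method, built on the startFrom parameter of Position, could look like this
(assuming $1 is a pointer to a text array, $2 the source text, $3 the
delimiter):
***
//explode (->array_at;source_t;delim_t)
C_POINTER($1)
C_TEXT($2;$3)
C_LONGINT($start_l;$pos_l)
DELETE FROM ARRAY($1->;1;Size of array($1->))  //start from an empty array
$start_l:=1
Repeat
	$pos_l:=Position($3;$2;$start_l;*)
	If ($pos_l>0)
		APPEND TO ARRAY($1->;Substring($2;$start_l;$pos_l-$start_l))
		$start_l:=$pos_l+Length($3)
	End if
Until ($pos_l=0)
APPEND TO ARRAY($1->;Substring($2;$start_l))  //trailing piece after last delimiter
***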
--
Arnaud de Montard
**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ: http://lists.4d.com/faqnug.html
Archive: http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:[email protected]
**********************************************************************