Re: [PD] Fastest way to find lines in text file

2017-03-23 Thread Jack
On 23/03/2017 at 09:24, IOhannes m zmoelnig wrote:
> On 2017-03-22 16:55, Jack wrote:
>> On 22/03/2017 at 16:41, Christof Ressi wrote:
>>> does it *really* have to be faster than 40ms? what are you trying to do? do 
>>> you *need* the output in 0 logical time? depending on the situation you 
>>> might want to spread the computation across multiple audio blocks or if you 
>>> don't care about determinism, have the file in another instance of pd and 
>>> communicate with netsend/netreceive (one instance makes a request and the 
>>> other instance sends the result once the search is finished). 
>>
>> Yes, i need 0 logical time, 
> 
> if you are mainly concerned about logical time, then everything is good:
> 0 logical time can take 200ms real time and more.

Sure.
But I would be happier if this 0 logical time took 2 min instead of
10 min!
++

Jack


> 
> gmsdfrt
> IOhannes
> 
> 
> 
> ___
> Pd-list@lists.iem.at mailing list
> UNSUBSCRIBE and account-management -> 
> https://lists.puredata.info/listinfo/pd-list
> 






Re: [PD] Fastest way to find lines in text file

2017-03-23 Thread IOhannes m zmoelnig
On 2017-03-22 16:55, Jack wrote:
> On 22/03/2017 at 16:41, Christof Ressi wrote:
>> does it *really* have to be faster than 40ms? what are you trying to do? do 
>> you *need* the output in 0 logical time? depending on the situation you 
>> might want to spread the computation across multiple audio blocks or if you 
>> don't care about determinism, have the file in another instance of pd and 
>> communicate with netsend/netreceive (one instance makes a request and the 
>> other instance sends the result once the search is finished). 
> 
> Yes, i need 0 logical time, 

if you are mainly concerned about logical time, then everything is good:
0 logical time can take 200ms real time and more.

gmsdfrt
IOhannes





Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
On 22/03/2017 at 17:10, cyrille henry wrote:
> 
> 
> On 22/03/2017 at 17:01, Jack wrote:
>> Good Idea !
>> Just need to order the textfile (In fact, the file is not totally
>> ordered) ;)
>> Thanx.
>> Speaking on this topic, give me a new idea on the good method to
>> adopt. :)
> 
> since you can do it in a non real time way, I think python have a sort
> function that can do this easily.
> or try with libre office.

Or on the command line:
$ sort -k1 -g linksIdOK.txt
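
For anyone who would rather stay in Python (as suggested above), a minimal sketch of a numeric sort on the first column could look like this. The function name and the in-memory list of lines are assumptions made for illustration; ties keep their original order here, which may differ from how `sort` orders equal keys.

```python
def sort_by_first_column(lines):
    """Return the non-empty lines sorted numerically by their first field."""
    rows = [ln for ln in lines if ln.strip()]
    # Python's sort is stable, so lines with the same key keep their order.
    return sorted(rows, key=lambda ln: int(ln.split()[0]))


if __name__ == "__main__":
    # A few lines taken from the sample posted in this thread, shuffled.
    sample = ["345598 579883", "345594 577427", "345595 374384", "345594 567267"]
    for row in sort_by_first_column(sample):
        print(row)
```

For a 2.5-million-line file this reads everything into memory, which should be acceptable for an offline preprocessing step.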
++

Jack


> 
> cheers
> c
> 
>> ++
>>
>> Jack
>>
>>
>>
>> On 22/03/2017 at 16:46, cyrille henry wrote:
>>> if you textfile is composed of 2 row of number you can optimize the
>>> search with prior treatment.
>>>
>>> 1 : order the index column (already done in your example)
>>> 2 : create 2 table of start index, and number of occurrence of this
>>> index
>>> in you example, the "start index table" would be 0 at 345594, 5 at
>>> 345595, 15 at 345596, 16 at 345598
>>> the "number of occurrence index table" would be : 5 at 345594, 10 at
>>> 345595, 1 at 345596, 4 at 345598
>>> 3 : put column 2 of you textfile in a "data table"
>>>
>>> now, when searching for 345595, you just have to [tabread table1] and
>>> [tabread table2] at position 345595, and with a small until loop you
>>> just have to read the data table only where needed.
>>>
>>> cheers
>>> c
>>>
>>> On 22/03/2017 at 14:34, Jack wrote:
>>>> I guess my 2 precedent mails were enough clear.
>>>> But i will answer at each point :
>>>>
>>>> 1) My previous mails :
>>>> I need to find every lines of a textfile containing a word.
>>>> The textfile has 2.539.592 lines.
>>>> Now, i am using [msgfile] from zexy because i can find a line, skip a
>>>> line and find again ... until the end of the textfile.
>>>> But, i am wondering if there is an other object (in an other library)
>>>> faster, specialized in this work ?
>>>> ...
>>>> The textfile has only two "strings" by line.
>>>> Here, 20 lines of the textfile :
>>>>
>>>> 345594 577427
>>>> 345594 567267
>>>> 345594 528911
>>>> 345594 534435
>>>> 345594 523087
>>>> 345595 374384
>>>> 345595 377303
>>>> 345595 380544
>>>> 345595 379911
>>>> 345595 557020
>>>> 345595 552396
>>>> 345595 562487
>>>> 345595 460842
>>>> 345595 428449
>>>> 345595 424095
>>>> 345596 447676
>>>> 345598 579883
>>>> 345598 379495
>>>> 345598 379039
>>>> 345598 380328
>>>>
>>>> 2) See above
>>>> 3) See above
>>>> 4) See above
>>>> 5) Linux/Ubuntu 16.10/Pd 0.47.1
>>>> 6) you abuse :)
>>>>
>>>> ++
>>>>
>>>> Jack
>>>>
>>>>
>>>>
>>>>
>>>> On 22/03/2017 at 13:31, Lorenzo Sutton wrote:
>>>>> Hi,
>>>>>
>>>>> On 22/03/2017 13:01, Jack wrote:
>>>>>> I need to find all instances that math to the first row.
>>>>>> It is not possible with [text search] if i am right.
>>>>>
>>>>> I think you should outline your use case/problem in more detail. This
>>>>> should be a good practice when asking for support on the Mailing List.
>>>>>
>>>>> Example:
>>>>>
>>>>> 1) I have a text file where each line contains a two integers
>>>>> separated
>>>>> by a space (" ") char - such as (possibly paste a part of the file on
>>>>> pastebin or similar too).
>>>>> 213214 12313
>>>>> 123223 13213
>>>>>
>>>>> 2) My file is [always/at least/circa/ ...] 2,539,592 lines long
>>>>>
>>>>> 3) My algorithm should find all subsequent lines matching the first
>>>>> line
>>>>> in the file and return [all line numbers for matches / the total count
>>>>> of matched lines / ...]
>>>>>
>>>>> 3) I want the algorithm to be [as fast as possible / run in under 1
>>>>> second / run in under 1ms / ... ]
>>>>>
>>>>> 4) I [want to / do not need to] use Pd Vanilla
>>>&

Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread cyrille henry



On 22/03/2017 at 17:01, Jack wrote:

Good Idea !
Just need to order the textfile (In fact, the file is not totally
ordered) ;)
Thanx.
Speaking on this topic, give me a new idea on the good method to adopt. :)


since you can do it in a non-real-time way, I think Python has a sort function
that can do this easily.
or try with LibreOffice.

cheers
c


++

Jack



On 22/03/2017 at 16:46, cyrille henry wrote:

if you textfile is composed of 2 row of number you can optimize the
search with prior treatment.

1 : order the index column (already done in your example)
2 : create 2 table of start index, and number of occurrence of this index
in you example, the "start index table" would be 0 at 345594, 5 at
345595, 15 at 345596, 16 at 345598
the "number of occurrence index table" would be : 5 at 345594, 10 at
345595, 1 at 345596, 4 at 345598
3 : put column 2 of you textfile in a "data table"

now, when searching for 345595, you just have to [tabread table1] and
[tabread table2] at position 345595, and with a small until loop you
just have to read the data table only where needed.

cheers
c

On 22/03/2017 at 14:34, Jack wrote:

I guess my 2 precedent mails were enough clear.
But i will answer at each point :

1) My previous mails :
I need to find every lines of a textfile containing a word.
The textfile has 2.539.592 lines.
Now, i am using [msgfile] from zexy because i can find a line, skip a
line and find again ... until the end of the textfile.
But, i am wondering if there is an other object (in an other library)
faster, specialized in this work ?
...
The textfile has only two "strings" by line.
Here, 20 lines of the textfile :

345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328

2) See above
3) See above
4) See above
5) Linux/Ubuntu 16.10/Pd 0.47.1
6) you abuse :)

++

Jack




On 22/03/2017 at 13:31, Lorenzo Sutton wrote:

Hi,

On 22/03/2017 13:01, Jack wrote:

I need to find all instances that math to the first row.
It is not possible with [text search] if i am right.


I think you should outline your use case/problem in more detail. This
should be a good practice when asking for support on the Mailing List.

Example:

1) I have a text file where each line contains a two integers separated
by a space (" ") char - such as (possibly paste a part of the file on
pastebin or similar too).
213214 12313
123223 13213

2) My file is [always/at least/circa/ ...] 2,539,592 lines long

3) My algorithm should find all subsequent lines matching the first line
in the file and return [all line numbers for matches / the total count
of matched lines / ...]

3) I want the algorithm to be [as fast as possible / run in under 1
second / run in under 1ms / ... ]

4) I [want to / do not need to] use Pd Vanilla

5) My patch should run on [All platforms / Windows / OSX / Linux / ...]

6) My patch should run [on potentially any machine / on a Raspberry Pi /
on a 1990s 386 machine / on my digital toaster where I have compiled a
custom version of Pd / ... ]

:)



++

Jack



On 22/03/2017 at 08:27, Liam Goodacre wrote:

You can also use [text search], although t's not so easy to find more
than the first instance. If you don't mind taking a extra step, you
could give each line a third term, which is the line number. Then you
can use the "> 3" argument for [text search] to find matches s





*From:* Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack
<j...@rybn.org>
*Sent:* 21 March 2017 18:14
*To:* pd-list@lists.iem.at
*Subject:* [PD] Fastest way to find lines in text file

Hello,

I need to find every lines of a textfile containing a word.
The textfile has 2.539.592 lines.
Now, i am using [msgfile] from zexy because i can find a line, skip a
line and find again ... until the end of the textfile.
But, i am wondering if there is an other object (in an other library)
faster, specialized in this work ?
Thanx.
++

Jack



Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
Good idea!
I just need to sort the textfile (in fact, the file is not totally
sorted) ;)
Thanks.
Talking about this topic gave me a new idea about the right method to adopt. :)
++

Jack



On 22/03/2017 at 16:46, cyrille henry wrote:
> if you textfile is composed of 2 row of number you can optimize the
> search with prior treatment.
> 
> 1 : order the index column (already done in your example)
> 2 : create 2 table of start index, and number of occurrence of this index
> in you example, the "start index table" would be 0 at 345594, 5 at
> 345595, 15 at 345596, 16 at 345598
> the "number of occurrence index table" would be : 5 at 345594, 10 at
> 345595, 1 at 345596, 4 at 345598
> 3 : put column 2 of you textfile in a "data table"
> 
> now, when searching for 345595, you just have to [tabread table1] and
> [tabread table2] at position 345595, and with a small until loop you
> just have to read the data table only where needed.
> 
> cheers
> c
> 
> On 22/03/2017 at 14:34, Jack wrote:
>> I guess my 2 precedent mails were enough clear.
>> But i will answer at each point :
>>
>> 1) My previous mails :
>> I need to find every lines of a textfile containing a word.
>> The textfile has 2.539.592 lines.
>> Now, i am using [msgfile] from zexy because i can find a line, skip a
>> line and find again ... until the end of the textfile.
>> But, i am wondering if there is an other object (in an other library)
>> faster, specialized in this work ?
>> ...
>> The textfile has only two "strings" by line.
>> Here, 20 lines of the textfile :
>>
>> 345594 577427
>> 345594 567267
>> 345594 528911
>> 345594 534435
>> 345594 523087
>> 345595 374384
>> 345595 377303
>> 345595 380544
>> 345595 379911
>> 345595 557020
>> 345595 552396
>> 345595 562487
>> 345595 460842
>> 345595 428449
>> 345595 424095
>> 345596 447676
>> 345598 579883
>> 345598 379495
>> 345598 379039
>> 345598 380328
>>
>> 2) See above
>> 3) See above
>> 4) See above
>> 5) Linux/Ubuntu 16.10/Pd 0.47.1
>> 6) you abuse :)
>>
>> ++
>>
>> Jack
>>
>>
>>
>>
>> On 22/03/2017 at 13:31, Lorenzo Sutton wrote:
>>> Hi,
>>>
>>> On 22/03/2017 13:01, Jack wrote:
>>>> I need to find all instances that math to the first row.
>>>> It is not possible with [text search] if i am right.
>>>
>>> I think you should outline your use case/problem in more detail. This
>>> should be a good practice when asking for support on the Mailing List.
>>>
>>> Example:
>>>
>>> 1) I have a text file where each line contains a two integers separated
>>> by a space (" ") char - such as (possibly paste a part of the file on
>>> pastebin or similar too).
>>> 213214 12313
>>> 123223 13213
>>>
>>> 2) My file is [always/at least/circa/ ...] 2,539,592 lines long
>>>
>>> 3) My algorithm should find all subsequent lines matching the first line
>>> in the file and return [all line numbers for matches / the total count
>>> of matched lines / ...]
>>>
>>> 3) I want the algorithm to be [as fast as possible / run in under 1
>>> second / run in under 1ms / ... ]
>>>
>>> 4) I [want to / do not need to] use Pd Vanilla
>>>
>>> 5) My patch should run on [All platforms / Windows / OSX / Linux / ...]
>>>
>>> 6) My patch should run [on potentially any machine / on a Raspberry Pi /
>>> on a 1990s 386 machine / on my digital toaster where I have compiled a
>>> custom version of Pd / ... ]
>>>
>>> :)
>>>
>>>
>>>> ++
>>>>
>>>> Jack
>>>>
>>>>
>>>>
>>>> On 22/03/2017 at 08:27, Liam Goodacre wrote:
>>>>> You can also use [text search], although t's not so easy to find more
>>>>> than the first instance. If you don't mind taking a extra step, you
>>>>> could give each line a third term, which is the line number. Then you
>>>>> can use the "> 3" argument for [text search] to find matches s
>>>>>
>>>>>
>>>>>
>>>>> 
>>>>>
>>>>> *From:* Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack
>>>>> <j...@rybn.org>
>>>>> *Sent:* 21 March 2017 18:1

Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
On 22/03/2017 at 16:41, Christof Ressi wrote:
> you're right, it's indeed slower, I just tested it. implementing the array 
> traversal in pd leads to many redundant operations, which slow the thing 
> down. but it should be quite easy to implement it as external. if you haven't 
> written externals yet, see it as a challenge :-).
> 
> but the thing I'm rather wondering is: 
> 
> does it *really* have to be faster than 40ms? what are you trying to do? do 
> you *need* the output in 0 logical time? depending on the situation you might 
> want to spread the computation across multiple audio blocks or if you don't 
> care about determinism, have the file in another instance of pd and 
> communicate with netsend/netreceive (one instance makes a request and the 
> other instance sends the result once the search is finished). 

Yes, I need 0 logical time, because I want to draw a graphical map in
Gem and save it as a picture. There are a lot of elements to draw.
++

Jack



> 
> Christof
> 
>> Sent: Wednesday, 22 March 2017 at 12:54
>> From: Jack <j...@rybn.org>
>> To: "Christof Ressi" <christof.re...@gmx.at>
>> Cc: pd-list@lists.iem.at
>> Subject: Re: Aw: Re: [PD] Fastest way to find lines in text file
>>
>> On 21/03/2017 at 22:16, Christof Ressi wrote:
>>>> I need to find every lines of a textfile containing a word.
>>>
>>> that sentence is quite ambiguous, hehe.
>>
>> We can talk about integers.
>>
>>> so we're talking about integers. I'd load the textfile with [text] and then
>>> use [text sequence] to spit out all the lines and copy them to a table with
>>> 2 * 2.539.592 elements. looping through the table and finding your words
>>> should be straightforward and much faster.
>>
>> The problem with this solution (if I am right) is that you need to store the
>> second element somewhere to return the line (e.g. 345594 577427) when
>> the first matches your integer (I just need to test the first element).
>> I tested it with 345594 (5 lines match this integer) and [realtime]
>> returns 200 ms to execute it (with msgfile, it is 40 ms). Even if you
>> don't store the second integer to return the whole line, you need 70 ms
>> (with [array get], [drip] from zexy and [route] or [select]).
>> Maybe I am missing something?
>> ++
>>
>> Jack
>>
>>
>>
>>>
>>> Christof
>>>
>>>> Sent: Tuesday, 21 March 2017 at 19:20
>>>> From: Jack <j...@rybn.org>
>>>> To: pd-list@lists.iem.at
>>>> Subject: Re: [PD] Fastest way to find lines in text file
>>>>
>>>> The textfile has only two "strings" per line.
>>>> Here are 20 lines of the textfile:
>>>>
>>>> 345594 577427
>>>> 345594 567267
>>>> 345594 528911
>>>> 345594 534435
>>>> 345594 523087
>>>> 345595 374384
>>>> 345595 377303
>>>> 345595 380544
>>>> 345595 379911
>>>> 345595 557020
>>>> 345595 552396
>>>> 345595 562487
>>>> 345595 460842
>>>> 345595 428449
>>>> 345595 424095
>>>> 345596 447676
>>>> 345598 579883
>>>> 345598 379495
>>>> 345598 379039
>>>> 345598 380328
>>>> ++
>>>>
>>>> Jack
>>>>
>>>>
>>>> On 21/03/2017 at 19:14, Jack wrote:
>>>>> Hello,
>>>>>
>>>>> I need to find every lines of a textfile containing a word.
>>>>> The textfile has 2.539.592 lines.
>>>>> Now, i am using [msgfile] from zexy because i can find a line, skip a
>>>>> line and find again ... until the end of the textfile.
>>>>> But, i am wondering if there is an other object (in an other library)
>>>>> faster, specialized in this work ?
>>>>> Thanx.
>>>>> ++
>>>>>
>>>>> Jack
>>>>>
>>>>>
>>
>>




Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread cyrille henry

if your textfile is composed of 2 columns of numbers, you can optimize the
search with some preprocessing.

1 : sort the index column (already done in your example)
2 : create 2 tables: the start index of each index value, and its number of occurrences
in your example, the "start index table" would be 0 at 345594, 5 at 345595, 15
at 345596, 16 at 345598
the "number of occurrences table" would be: 5 at 345594, 10 at 345595, 1
at 345596, 4 at 345598
3 : put column 2 of your textfile in a "data table"

now, when searching for 345595, you just have to [tabread table1] and [tabread
table2] at position 345595, and with a small until loop you read
the data table only where needed.
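
The preprocessing described above can be sketched outside Pd as follows. This is only an illustration in Python, not a Pd patch: the function names are made up, and dictionaries stand in for the two Pd tables that would be read with [tabread] at the key's position. The sample rows are the 20 lines posted earlier in the thread.

```python
SAMPLE = [
    (345594, 577427), (345594, 567267), (345594, 528911), (345594, 534435),
    (345594, 523087), (345595, 374384), (345595, 377303), (345595, 380544),
    (345595, 379911), (345595, 557020), (345595, 552396), (345595, 562487),
    (345595, 460842), (345595, 428449), (345595, 424095), (345596, 447676),
    (345598, 579883), (345598, 379495), (345598, 379039), (345598, 380328),
]


def build_index(rows):
    """Build the three tables from (key, value) rows already sorted by key."""
    start = {}  # key -> position of its first row in `data`
    count = {}  # key -> number of rows carrying that key
    data = []   # column 2 of the file, in file order
    for pos, (key, value) in enumerate(rows):
        if key not in start:
            start[key] = pos
            count[key] = 0
        count[key] += 1
        data.append(value)
    return start, count, data


def lookup(key, start, count, data):
    """Return every column-2 value for `key`, reading only the needed slice."""
    if key not in start:
        return []
    first = start[key]
    return data[first:first + count[key]]


if __name__ == "__main__":
    start, count, data = build_index(SAMPLE)
    print(start[345595], count[345595])
    print(lookup(345595, start, count, data))
```

The point of the design is that a lookup costs two table reads plus one pass over the matching rows, instead of a scan over all 2.539.592 lines.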

cheers
c

On 22/03/2017 at 14:34, Jack wrote:

I guess my 2 precedent mails were enough clear.
But i will answer at each point :

1) My previous mails :
I need to find every lines of a textfile containing a word.
The textfile has 2.539.592 lines.
Now, i am using [msgfile] from zexy because i can find a line, skip a
line and find again ... until the end of the textfile.
But, i am wondering if there is an other object (in an other library)
faster, specialized in this work ?
...
The textfile has only two "strings" by line.
Here, 20 lines of the textfile :

345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328

2) See above
3) See above
4) See above
5) Linux/Ubuntu 16.10/Pd 0.47.1
6) you abuse :)

++

Jack




On 22/03/2017 at 13:31, Lorenzo Sutton wrote:

Hi,

On 22/03/2017 13:01, Jack wrote:

I need to find all instances that math to the first row.
It is not possible with [text search] if i am right.


I think you should outline your use case/problem in more detail. This
should be a good practice when asking for support on the Mailing List.

Example:

1) I have a text file where each line contains a two integers separated
by a space (" ") char - such as (possibly paste a part of the file on
pastebin or similar too).
213214 12313
123223 13213

2) My file is [always/at least/circa/ ...] 2,539,592 lines long

3) My algorithm should find all subsequent lines matching the first line
in the file and return [all line numbers for matches / the total count
of matched lines / ...]

3) I want the algorithm to be [as fast as possible / run in under 1
second / run in under 1ms / ... ]

4) I [want to / do not need to] use Pd Vanilla

5) My patch should run on [All platforms / Windows / OSX / Linux / ...]

6) My patch should run [on potentially any machine / on a Raspberry Pi /
on a 1990s 386 machine / on my digital toaster where I have compiled a
custom version of Pd / ... ]

:)



++

Jack



On 22/03/2017 at 08:27, Liam Goodacre wrote:

You can also use [text search], although t's not so easy to find more
than the first instance. If you don't mind taking a extra step, you
could give each line a third term, which is the line number. Then you
can use the "> 3" argument for [text search] to find matches s




*From:* Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack
<j...@rybn.org>
*Sent:* 21 March 2017 18:14
*To:* pd-list@lists.iem.at
*Subject:* [PD] Fastest way to find lines in text file

Hello,

I need to find every lines of a textfile containing a word.
The textfile has 2.539.592 lines.
Now, i am using [msgfile] from zexy because i can find a line, skip a
line and find again ... until the end of the textfile.
But, i am wondering if there is an other object (in an other library)
faster, specialized in this work ?
Thanx.
++

Jack




Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Christof Ressi
you're right, it's indeed slower, I just tested it. implementing the array 
traversal in pd leads to many redundant operations, which slow the thing down. 
but it should be quite easy to implement it as an external. if you haven't written 
externals yet, see it as a challenge :-).

but the thing I'm rather wondering is: 

does it *really* have to be faster than 40ms? what are you trying to do? do you 
*need* the output in 0 logical time? depending on the situation you might want 
to spread the computation across multiple audio blocks or if you don't care 
about determinism, have the file in another instance of pd and communicate with 
netsend/netreceive (one instance makes a request and the other instance sends 
the result once the search is finished). 
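
To illustrate the idea of spreading the search over several steps, here is a rough Python sketch, not Pd code. The generator scans a fixed number of rows per call, so each `next()` would correspond to the work done in one block or tick; the function name and the `chunk_size` parameter are assumptions made for this example.

```python
def chunked_search(keys, values, target, chunk_size):
    """Yield, for each slice of `chunk_size` rows, the (key, value) matches in it."""
    for begin in range(0, len(keys), chunk_size):
        end = min(begin + chunk_size, len(keys))
        # Only this slice is scanned before control returns to the caller.
        yield [(keys[i], values[i]) for i in range(begin, end) if keys[i] == target]


if __name__ == "__main__":
    keys = [345594, 345594, 345595, 345594, 345596]
    values = [577427, 567267, 374384, 528911, 447676]
    for step, matches in enumerate(chunked_search(keys, values, 345594, 2)):
        print(step, matches)
```

The trade-off is the one raised in the question above: the results no longer arrive in 0 logical time, but no single step blocks for long.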

Christof

> Sent: Wednesday, 22 March 2017 at 12:54
> From: Jack <j...@rybn.org>
> To: "Christof Ressi" <christof.re...@gmx.at>
> Cc: pd-list@lists.iem.at
> Subject: Re: Aw: Re: [PD] Fastest way to find lines in text file
>
> On 21/03/2017 at 22:16, Christof Ressi wrote:
> >> I need to find every lines of a textfile containing a word.
> >
> > that sentence is quite ambiguous, hehe.
>
> We can talk about integers.
>
> > so we're talking about integers. I'd load the textfile with [text] and then
> > use [text sequence] to spit out all the lines and copy them to a table with
> > 2 * 2.539.592 elements. looping through the table and finding your words
> > should be straightforward and much faster.
>
> The problem with this solution (if I am right) is that you need to store the
> second element somewhere to return the line (e.g. 345594 577427) when
> the first matches your integer (I just need to test the first element).
> I tested it with 345594 (5 lines match this integer) and [realtime]
> returns 200 ms to execute it (with msgfile, it is 40 ms). Even if you
> don't store the second integer to return the whole line, you need 70 ms
> (with [array get], [drip] from zexy and [route] or [select]).
> Maybe I am missing something?
> ++
> 
> Jack
> 
> 
> 
> > 
> > Christof
> > 
> >> Sent: Tuesday, 21 March 2017 at 19:20
> >> From: Jack <j...@rybn.org>
> >> To: pd-list@lists.iem.at
> >> Subject: Re: [PD] Fastest way to find lines in text file
> >>
> >> The textfile has only two "strings" per line.
> >> Here are 20 lines of the textfile:
> >>
> >> 345594 577427
> >> 345594 567267
> >> 345594 528911
> >> 345594 534435
> >> 345594 523087
> >> 345595 374384
> >> 345595 377303
> >> 345595 380544
> >> 345595 379911
> >> 345595 557020
> >> 345595 552396
> >> 345595 562487
> >> 345595 460842
> >> 345595 428449
> >> 345595 424095
> >> 345596 447676
> >> 345598 579883
> >> 345598 379495
> >> 345598 379039
> >> 345598 380328
> >> ++
> >>
> >> Jack
> >>
> >>
> >> On 21/03/2017 at 19:14, Jack wrote:
> >>> Hello,
> >>>
> >>> I need to find every lines of a textfile containing a word.
> >>> The textfile has 2.539.592 lines.
> >>> Now, i am using [msgfile] from zexy because i can find a line, skip a
> >>> line and find again ... until the end of the textfile.
> >>> But, i am wondering if there is an other object (in an other library)
> >>> faster, specialized in this work ?
> >>> Thanx.
> >>> ++
> >>>
> >>> Jack
> >>>
> >>>
> 
>



Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
Thanks, Martin, for this suggestion. Yes, I think Pdlua and Pyext are good
for this purpose.
++

Jack



On 22/03/2017 at 16:22, Martin Peach wrote:
> On Wed, Mar 22, 2017 at 9:34 AM, Jack wrote:
>  
> 
> ...
> 
> But, i am wondering if there is an other object (in an other library)
> faster, specialized in this work ?
> 
> 
> [pdlua] is excellent for scanning text.
> 
> Martin
> 
> 




Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Martin Peach
On Wed, Mar 22, 2017 at 9:34 AM, Jack wrote:


> ...
>
> But, i am wondering if there is an other object (in an other library)
> faster, specialized in this work ?
>

[pdlua] is excellent for scanning text.

Martin


Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
I guess my 2 previous mails were clear enough.
But I will answer each point:

1) My previous mails:
I need to find every line of a textfile containing a word.
The textfile has 2.539.592 lines.
Now, I am using [msgfile] from zexy because I can find a line, skip a
line and find again ... until the end of the textfile.
But I am wondering if there is another object (in another library)
that is faster and specialized in this work?
...
The textfile has only two "strings" per line.
Here are 20 lines of the textfile:

345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328

2) See above
3) See above
4) See above
5) Linux/Ubuntu 16.10/Pd 0.47.1
6) you're exaggerating :)

++

Jack




On 22/03/2017 at 13:31, Lorenzo Sutton wrote:
> Hi,
> 
> On 22/03/2017 13:01, Jack wrote:
>> I need to find all instances that math to the first row.
>> It is not possible with [text search] if i am right.
> 
> I think you should outline your use case/problem in more detail. This
> should be a good practice when asking for support on the Mailing List.
> 
> Example:
> 
> 1) I have a text file where each line contains a two integers separated
> by a space (" ") char - such as (possibly paste a part of the file on
> pastebin or similar too).
> 213214 12313
> 123223 13213
> 
> 2) My file is [always/at least/circa/ ...] 2,539,592 lines long
> 
> 3) My algorithm should find all subsequent lines matching the first line
> in the file and return [all line numbers for matches / the total count
> of matched lines / ...]
> 
> 3) I want the algorithm to be [as fast as possible / run in under 1
> second / run in under 1ms / ... ]
> 
> 4) I [want to / do not need to] use Pd Vanilla
> 
> 5) My patch should run on [All platforms / Windows / OSX / Linux / ...]
> 
> 6) My patch should run [on potentially any machine / on a Raspberry Pi /
> on a 1990s 386 machine / on my digital toaster where I have compiled a
> custom version of Pd / ... ]
> 
> :)
> 
> 
>> ++
>>
>> Jack
>>
>>
>>
>> On 22/03/2017 at 08:27, Liam Goodacre wrote:
>>> You can also use [text search], although t's not so easy to find more
>>> than the first instance. If you don't mind taking a extra step, you
>>> could give each line a third term, which is the line number. Then you
>>> can use the "> 3" argument for [text search] to find matches s
>>>
>>>
>>>
>>> ----
>>> *From:* Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack
>>> <j...@rybn.org>
>>> *Sent:* 21 March 2017 18:14
>>> *To:* pd-list@lists.iem.at
>>> *Subject:* [PD] Fastest way to find lines in text file
>>>
>>> Hello,
>>>
>>> I need to find every lines of a textfile containing a word.
>>> The textfile has 2.539.592 lines.
>>> Now, i am using [msgfile] from zexy because i can find a line, skip a
>>> line and find again ... until the end of the textfile.
>>> But, i am wondering if there is an other object (in an other library)
>>> faster, specialized in this work ?
>>> Thanx.
>>> ++
>>>
>>> Jack
>>>
>>>




Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Lorenzo Sutton

Hi,

On 22/03/2017 13:01, Jack wrote:

I need to find all instances that match the first row.
It is not possible with [text search], if I am right.


I think you should outline your use case/problem in more detail. This 
is good practice when asking for support on the mailing list.


Example:

1) I have a text file where each line contains two integers separated 
by a space (" ") character, such as the lines below (possibly paste part of 
the file on pastebin or similar too).

213214 12313
123223 13213

2) My file is [always/at least/circa/ ...] 2,539,592 lines long

3) My algorithm should find all subsequent lines matching the first line 
in the file and return [all line numbers for matches / the total count 
of matched lines / ...]


4) I want the algorithm to be [as fast as possible / run in under 1 
second / run in under 1 ms / ... ]


5) I [want to / do not need to] use Pd Vanilla

6) My patch should run on [All platforms / Windows / OSX / Linux / ...]

7) My patch should run [on potentially any machine / on a Raspberry Pi / 
on a 1990s 386 machine / on my digital toaster where I have compiled a 
custom version of Pd / ... ]


:)




Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
:) I remember that gedit was buggy...

Your proposal with until/textfile/select is slower than msgfile (200
ms vs. 40 ms with [msgfile] and its "find" method).

But what do you mean by "you just need to add a counter"?
++

Jack



On 22/03/2017 at 09:47, cyrille henry wrote:
> a few years ago, I used "until / textfile / select" for search and replace
> in huge texts because gedit was too slow.
> I'd do it again if I had to deal with this kind of text.
> 
> for your application, you just need to add a counter.
> c
> 
> 

Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
I need to find all instances that match the first row.
It is not possible with [text search], if I am right.
++

Jack



On 22/03/2017 at 08:27, Liam Goodacre wrote:
> You can also use [text search], although it's not so easy to find more
> than the first instance. If you don't mind taking an extra step, you
> could give each line a third term, which is the line number. Then you
> can use the "> 3" argument for [text search] to find matches s
> 
> 
> 
> 

Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Jack
On 21/03/2017 at 22:16, Christof Ressi wrote:
>> I need to find every line of a text file containing a word.
> 
> that sentence is quite ambiguous, hehe. 

We can talk about integers.

> so we're talking about integers. I'd load the text file with [text] and then 
> use [text sequence] to spit out all the lines and copy them to a table with 2 
> * 2,539,592 elements. looping through the table and finding your words should 
> be straightforward and much faster.

The problem with this solution (if I am right) is that you need to store the
second element somewhere in order to return the whole line (e.g. 345594 577427)
when the first element matches your integer (I only need to test the first element).
I tested it with 345594 (5 lines match this integer) and [realtime]
reports 200 ms to execute it (with msgfile, it is 40 ms). Even if you
don't store the second integer to return the whole line, you need 70 ms
(with [array get], [drip] from zexy and [route] or [select]).
Maybe I am missing something?
++

Jack



> 
> Christof
> 

Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread cyrille henry

a few years ago, I used "until / textfile / select" for search and replace in 
huge texts because gedit was too slow.
I'd do it again if I had to deal with this kind of text.

for your application, you just need to add a counter.
c
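
One possible reading of "add a counter", sketched in Python rather than as a Pd patch: step through the file one line at a time (the role played by until / textfile) and keep a running line count, so that every match can be reported together with its position. The function name `find_matching_lines` and the in-memory demo data are assumptions made for illustration; they do not come from the thread.

```python
import io

def find_matching_lines(stream, key):
    """Walk the stream line by line and collect (line_number, line) for each
    line whose first field equals key. The enumerate() index is the counter."""
    hits = []
    for line_number, line in enumerate(stream):
        fields = line.split()
        if fields and fields[0] == key:
            hits.append((line_number, line.rstrip("\n")))
    return hits

# Small in-memory stand-in for the real text file (hypothetical data layout).
demo = io.StringIO("345594 577427\n345595 374384\n345594 567267\n")
hits = find_matching_lines(demo, "345594")
```

The counter costs almost nothing per line; the running time is still dominated by reading and comparing every line once.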


On 21/03/2017 at 19:14, Jack wrote:

> Hello,
>
> I need to find every line of a text file containing a word.
> The text file has 2,539,592 lines.
> Right now I am using [msgfile] from zexy because I can find a line, skip a
> line, and find again ... until the end of the text file.
> But I am wondering whether there is another object (in another library)
> that is faster and specialized for this task?
> Thanks.
> ++
>
> Jack



Re: [PD] Fastest way to find lines in text file

2017-03-22 Thread Liam Goodacre
You can also use [text search], although it's not so easy to find more than the 
first instance. If you don't mind taking an extra step, you could give each line 
a third term, which is the line number. Then you can use the "> 3" argument for 
[text search] to find matches s
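
The message above is cut off, so the following is only one reading of the idea, written in Python as a stand-in and not as a model of how [text search] works internally: each row carries its own line number as a third term, and each new search asks for a row whose first term equals the key and whose third term is greater than the line number of the previous hit. The helper names `search_after` and `find_all` are hypothetical.

```python
def search_after(rows, key, last_line):
    """Return the first row whose first term equals key and whose
    line-number term is greater than last_line, or None if there is none."""
    for first, second, line_number in rows:
        if first == key and line_number > last_line:
            return (first, second, line_number)
    return None

def find_all(rows, key):
    """Repeat the search, each time requiring a line number beyond the last hit."""
    found = []
    last_line = -1
    while True:
        hit = search_after(rows, key, last_line)
        if hit is None:
            return found
        found.append(hit)
        last_line = hit[2]

# Give each (first, second) pair a third term: its line number.
pairs = [(345594, 577427), (345594, 567267), (345595, 374384), (345594, 528911)]
rows = [(first, second, n) for n, (first, second) in enumerate(pairs)]
all_hits = find_all(rows, 345594)
```

In this sketch every search restarts from the top of the data, so the total work grows with the number of matches times the number of rows; whether that is acceptable depends on how many matches are expected.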



From: Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack <j...@rybn.org>
Sent: 21 March 2017 18:14
To: pd-list@lists.iem.at
Subject: [PD] Fastest way to find lines in text file

Hello,

I need to find every line of a text file containing a word.
The text file has 2,539,592 lines.
Right now I am using [msgfile] from zexy because I can find a line, skip a
line, and find again ... until the end of the text file.
But I am wondering whether there is another object (in another library)
that is faster and specialized for this task?
Thanks.
++

Jack



Re: [PD] Fastest way to find lines in text file

2017-03-21 Thread Christof Ressi
> I need to find every line of a text file containing a word.

that sentence is quite ambiguous, hehe. 
so we're talking about integers. I'd load the text file with [text] and then use 
[text sequence] to spit out all the lines and copy them to a table with 2 * 
2,539,592 elements. looping through the table and finding your words should be 
straightforward and much faster.
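
A minimal sketch of the flat-table idea, in Python and not in Pd, assuming every line holds exactly two integers: the pairs are copied into one array of 2 * N elements, the first terms sit at the even indices, and a match returns the value stored right after it. The names `load_flat_table` and `find_pairs` are hypothetical and only serve the illustration.

```python
from array import array

def load_flat_table(lines):
    """Copy "first second" integer pairs into one flat array:
    [first0, second0, first1, second1, ...]."""
    table = array("l")
    for line in lines:
        first, second = line.split()
        table.append(int(first))
        table.append(int(second))
    return table

def find_pairs(table, key):
    """Scan the even slots (the first terms) and return every
    (first, second) pair whose first term equals key."""
    return [(table[i], table[i + 1])
            for i in range(0, len(table), 2)
            if table[i] == key]

sample = ["345594 577427", "345594 567267", "345595 374384", "345596 447676"]
table = load_flat_table(sample)
matches = find_pairs(table, 345594)
```

Because the second term lives directly next to the first, returning the whole line needs no extra lookup structure, only an index offset of one.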

Christof

> Sent: Tuesday, 21 March 2017 at 19:20
> From: Jack <j...@rybn.org>
> To: pd-list@lists.iem.at
> Subject: Re: [PD] Fastest way to find lines in text file
>
> The text file has only two "strings" per line.
> Here are 20 lines of the text file:
> 
> 345594 577427
> 345594 567267
> 345594 528911
> 345594 534435
> 345594 523087
> 345595 374384
> 345595 377303
> 345595 380544
> 345595 379911
> 345595 557020
> 345595 552396
> 345595 562487
> 345595 460842
> 345595 428449
> 345595 424095
> 345596 447676
> 345598 579883
> 345598 379495
> 345598 379039
> 345598 380328
> ++
> 
> Jack
> 
> 

Re: [PD] Fastest way to find lines in text file

2017-03-21 Thread Jack
The text file has only two "strings" per line.
Here are 20 lines of the text file:

345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328
++

Jack


On 21/03/2017 at 19:14, Jack wrote:
> Hello,
> 
> I need to find every line of a text file containing a word.
> The text file has 2,539,592 lines.
> Right now I am using [msgfile] from zexy because I can find a line, skip a
> line, and find again ... until the end of the text file.
> But I am wondering whether there is another object (in another library)
> that is faster and specialized for this task?
> Thanks.
> ++
> 
> Jack
> 
> 

[PD] Fastest way to find lines in text file

2017-03-21 Thread Jack
Hello,

I need to find every line of a text file containing a word.
The text file has 2,539,592 lines.
Right now I am using [msgfile] from zexy because I can find a line, skip a
line, and find again ... until the end of the text file.
But I am wondering whether there is another object (in another library)
that is faster and specialized for this task?
Thanks.
++

Jack

