Re: [Factor-talk] potential memory issue

2015-10-02 Thread HP wei
First,
in factor's listener terminal (not in the gui window, though),
Jon Harper suggested hitting Control-C and then t to terminate
long-running code.
When I hit Control-C in case (1) below, it brought up a low-level debugger (what
a pleasant surprise).

Let me ask a question first before I write more about investigating the
issue.
*** In the low-level debugger, one of the commands is 'data', which dumps the
data heap.
 Is there any way to dump the result to a file ??



Summary of further investigation.

The code
0 "a_path_to_big_folder" x [ link-info dup symbolic-link? [ drop ] [ size>>
+ ] if  ] each-file

(1) when x = t  (breadth-first, BFS)
 the memory usage reported by linux's 'top' shows a steady increase
 from around 190M to as high as 2GB before either I killed it or it hit the
 missing-file issue.

(2) when x = f  (depth-first, DFS)
 Watching RES from 'top', I noticed that
 the memory usage even dropped from 190M to around 94M before I went home
 and let the code run in the office.
 The next morning, I found that it finished OK with a total-file-size
on the data stack.

  But the total-file-size of about 280GB is incorrect.  It should be
around 74GB.

-

Just a reminder, our disk has the following properties.

   it is a disk with a tree of directories.
   directory count ~ 6000
   total number of files as of now ~ 1.1 million
   total number of softlinks ~ 57
   total file size ~ 70GB

   the number of files in each sub-directory (not counting the files in
sub-directories inside it)
   ranges from a few hundred to as high as on the order of ~10K.

   (NOTE-A) Some of the folders are in fact softlinks that link to "OTHER
disk locations".



For the above disk,  DFS appears to consume much less memory !
But the resulting file size is incorrect (280GB instead of 70GB).
This is presumably due to (NOTE-A): the code must have scanned through
those
OTHER disks.  But then the extra scanning appears to be incomplete,
because 280GB is too small.  A complete traversal of the above disk plus all
those
OTHER disks would amount to a few terabytes.

So, somewhere the traversing was screwed up. This may be another
investigation for another day.

--

For the case (1), I did Control-C to bring up the low-level debugger.
And type 'data' to look at the data heap content.
It is a LONG LONG list of stuff containing many tuples describing the
directory-entries.

I typed 'c' to let the code continue for a while,
hit Control-C again,
then typed 'data' to look at the data heap.
Since the list is TOO long to fit the screen, I could not see any
significant difference
in the last few lines of the output between this 'data' dump and the last one.

It will be nice to be able to dump the 'data' result to a file.
Then a more comprehensive comparison can be done.

I also tried typing 'gc' to invoke a round of garbage collection.
But nothing seemed to be affected.  The memory as monitored by 'top'
remained unchanged.



In closing,  the simple code (with DFS)
0 "a_path_to_big_folder" f [ link-info dup symbolic-link? [ drop ] [
size>> + ] if  ] each-file
could NOT achieve the intended action --- to sum up the file sizes for files
residing in a
disk (as pointed to by a_path_to_big_folder).

A custom iterator needs to be coded, after all.

Finally, the memory issue in the BFS case may simply be because the algorithm
requires a LOT of
memory to store all directory-entries at a certain depth in the tree.
If I could dump the 'data' content in the debugger to a file, I could see
more clearly
by comparing the content at two distinct moments (say when RES (from
'top')
reaches 1GB and when it reaches 2GB).

--HP
--
___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk


[Factor-talk] how to error trapping 'link-info'

2015-10-01 Thread HP wei
As suggested by John, I tested out the following action to
get the total file sizes of a disk volume.

0 "a_path_to_big_folder" [ link-info dup symbolic-link? [ drop ] [ size>> +
] if  ] each-file


Our big-folder is on a netapp server shared by tens of people. Many small
files get updated
every minute if not every second. The update may involve removing the file
first.
It has many, many subfolders which in turn have more subfolders.
Each subfolder may have hundreds of files (occasionally in the thousands).

After a few days' discussion with factor gurus, I understand that
each-file traverses the directory structure by first putting the
entries of a folder in a sequence, and then processes each entry one by one.
Although this may not use a big chunk of memory at a time,
it does have the following issue:



Last night, I left the command running and came back this morning to find
that it failed with the message:
lstat:  "a path to a file" does not exist !!!

This is because 'each-file' puts the file into the sequence, and
by the time
it is its turn to be processed, it is no longer there!!
Without error trapping, the above "0 ... each-file" could not work in our
case.

So, I guess I would need to do error-trapping on the word link-info.
I do not know how to do it.  Any hint ?

Thanks
HP


[Factor-talk] Fwd: how to error trapping 'link-info'

2015-10-01 Thread HP wei
Just want to elaborate on what I meant by 'error trapping' link-info.

[ link-info dup symbolic-link? [ drop ] [ size>> + ] if ]

In python's syntax, I would write the above quot as something like:

try:
    file_info = link_info(dir_entry)
    if not is_symbolic_link(file_info):
        total_size += get_size(file_info)
except OSError:
    pass


HP




-- Forwarded message --
From: HP wei <hpwe...@gmail.com>
Date: Thu, Oct 1, 2015 at 9:36 AM
Subject: how to error trapping 'link-info'
To: factor-talk@lists.sourceforge.net


As suggested by John, I tested out the following action to
get the total file sizes of a disk volume.

0 "a_path_to_big_folder" [ link-info dup symbolic-link? [ drop ] [ size>> +
] if  ] each-file


Our big-folder is on a netapp server shared by tens of people. Many small
files get updated
every minute if not every second. The update may involve removing the file
first.
It has many, many subfolders which in turn have more subfolders.
Each subfolder may have hundreds of files (occasionally in the thousands).

After a few days' discussion with factor gurus, I understand that
each-file traverses the directory structure by first putting the
entries of a folder in a sequence, and then processes each entry one by one.
Although this may not use a big chunk of memory at a time,
it does have the following issue:



Last night, I left the command running and came back this morning to find
that it failed with the message:
lstat:  "a path to a file" does not exist !!!

This is because 'each-file' puts the file into the sequence, and
by the time
it is its turn to be processed, it is no longer there!!
Without error trapping, the above "0 ... each-file" could not work in our
case.

So, I guess I would need to do error-trapping on the word link-info.
I do not know how to do it.  Any hint ?

Thanks
HP


[Factor-talk] remote listener over tcp

2015-10-01 Thread HP wei
The objective:
we have several linux machines on which I need to check certain
status (disks, update-jobs, etc.).
My plan is, if feasible, to run a remote factor-listener on each of
those
machines and run a 'master' factor on a machine that collects all info
and presents issues on a gui-window.

On the master machine, if I find an issue that needs to be addressed
on machine-A, I will need to issue a command from the master
machine's gui-window, and that command shall be sent to machine-A to be
executed.  [ The command may be a linux system command,
   or a custom command written in C++,
   or perhaps something written in factor, if
appropriate. ]

e.g.   initially,
I will have a word called 'check-disks'.
And the master can send the word over tcp to each of those machines
and collect report.

Eventually, each machine will do the checking regularly and send
issues back to the master.



The first step appears to be learning how to set up a remote listener over
tcp
and talk to it from a master machine.

Could you please direct me to a place where someone has written something
about
how to do this in factor ??   A working example would be nice.
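The master/worker shape described above can be sketched in Python (a hypothetical line-based protocol; this is not factor's actual remote-listener machinery): each machine runs a tiny TCP server mapping command names such as check-disks to local actions, and the master sends a name and reads back the report.

```python
import socket
import threading

# hypothetical command table a worker machine would expose
COMMANDS = {"check-disks": lambda: "disks: OK"}

def serve_once(server_sock):
    """Worker side: accept one connection, run the named command,
    send back its one-line report."""
    conn, _addr = server_sock.accept()
    with conn:
        name = conn.makefile().readline().strip()
        report = COMMANDS.get(name, lambda: "unknown command")()
        conn.sendall((report + "\n").encode())

def ask(host, port, name):
    """Master side: send a command name, return the worker's report."""
    with socket.create_connection((host, port)) as s:
        s.sendall((name + "\n").encode())
        return s.makefile().readline().strip()

if __name__ == "__main__":
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))      # ephemeral port for the demo
    srv.listen(1)
    threading.Thread(target=serve_once, args=(srv,), daemon=True).start()
    print(ask("127.0.0.1", srv.getsockname()[1], "check-disks"))
```

A real deployment would of course need authentication and framing; this only illustrates the request/report round trip.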

thanks
HP


[Factor-talk] potential memory issue --- Fwd: how to error trapping 'link-info'

2015-10-01 Thread HP wei
Well, I just checked the running factor session that failed the
overnight task mentioned in the email below.

From the linux system command 'top',
I see that this particular factor is using
VIRT   4.0g
RES   2.0g
%MEM 26%

I clicked on the restart-listener button and the numbers remain the same.
Should I have done more to clean up the memory usage ?

--

For comparison, I killed the factor session and restart it from the shell.
The numbers are
VIRT  940M
RES  182M
%MEM 2.2%

==> Had the factor continued to run last night,
   it would have probably exhausted the memory on the machine.
   I guess there might be some memory (leak) issue somewhere ???

--HP



-- Forwarded message --
From: HP wei <hpwe...@gmail.com>
Date: Thu, Oct 1, 2015 at 9:36 AM
Subject: how to error trapping 'link-info'
To: factor-talk@lists.sourceforge.net


As suggested by John, I tested out the following action to
get the total file sizes of a disk volume.

0 "a_path_to_big_folder" [ link-info dup symbolic-link? [ drop ] [ size>> +
] if  ] each-file


Our big-folder is on a netapp server shared by tens of people. Many small
files get updated
every minute if not every second. The update may involve removing the file
first.
It has many, many subfolders which in turn have more subfolders.
Each subfolder may have hundreds of files (occasionally in the thousands).

After a few days' discussion with factor gurus, I understand that
each-file traverses the directory structure by first putting the
entries of a folder in a sequence, and then processes each entry one by one.
Although this may not use a big chunk of memory at a time,
it does have the following issue:



Last night, I left the command running and came back this morning to find
that it failed with the message:
lstat:  "a path to a file" does not exist !!!

This is because 'each-file' puts the file into the sequence, and
by the time
it is its turn to be processed, it is no longer there!!
Without error trapping, the above "0 ... each-file" could not work in our
case.

So, I guess I would need to do error-trapping on the word link-info.
I do not know how to do it.  Any hint ?

Thanks
HP


Re: [Factor-talk] potential memory issue --- Fwd: how to error trapping 'link-info'

2015-10-01 Thread HP wei
Yes, I was able to find out a bit more about the memory issue.

I tried it again this afternoon.  After 50 minutes into the action
  0 "path" t [ link-info ... ] each-file
the system 'top' shows RES rises above 1.2GB and %MEM becomes 15.7%
and they continue to rise.
It blacks out the gui window of factor.

I tried hitting Control-C but it continues to run.
*** How does one exit a running word ?

It looks like the only natural way I know of to 'stop' it is to wait for
link-info to hit the missing-file scenario --- like last night's overnight
run.

So, I just killed the factor session from the shell.  And missed the
opportunity
to inspect the memory usage in factor, as John suggested.
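John's suggestion of histogramming live instances by class (his factor snippet is quoted below) has a rough CPython analogue; this sketch uses the gc module and illustrates only the same diagnostic idea, not factor's tools.memory:

```python
import gc
from collections import Counter

def instance_histogram(top=10):
    """Count live (gc-tracked) objects by type name and return the most
    common -- a rough analogue of
    `all-instances [ class-of ] histogram-by sort-values reverse`."""
    return Counter(type(o).__name__ for o in gc.get_objects()).most_common(top)
```

Calling this before and after a suspect operation shows which class of object is accumulating, if there is a per-file leak.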

Is there a way to exit a running word ?
[ Perhaps I need to learn to use a factor debugger ? ]

-

Replying to John's questions about the disk layout:
   it is a disk with a tree of directories.
   directory count ~ 6000
   total number of files as of now ~ 1.1 million
   total number of softlinks ~ 57
   total file size ~ 70GB

   the number of files in each sub-directory (not counting the files in
sub-directories inside it)
   ranges from a few hundred to as high as on the order of ~10K.

   Some of the directories are constantly updated throughout the day.

--HP



On Thu, Oct 1, 2015 at 12:27 PM, John Benediktsson <mrj...@gmail.com> wrote:

> Maybe you can debug a little if you see that happen again?
>
> Perhaps something like this to get the largest number of instances, if
> there is a per-file leak:
>
> IN: scratchpad all-instances [ class-of ] histogram-by
> sort-values reverse 10 head .
>
> Some other words for inspecting memory:
>
> http://docs.factorcode.org/content/article-tools.memory.html
>
> Can you give us some information about your disk layout?
>
> Is it one big directory with 1 million files?  Is it a tree of
> directories?  What do you think is average number of files per-directory?
>
> I opened a bug report if you'd like to provide feedback there rather than
> the mailing list:
>
> https://github.com/slavapestov/factor/issues/1483
>
>
>
>
> On Thu, Oct 1, 2015 at 8:38 AM, HP wei <hpwe...@gmail.com> wrote:
>
>> Well, I just checked the running factor session that failed the
>> task overnight that I mentioned in below email.
>>
>> From the linux system command 'top',
>> I see that this particular factor is using
>> VIRT   4.0g
>> RES   2.0g
>> %MEM 26%
>>
>> I clicked on the restart listener button and the numbers remain the same.
>> should I have done more to clean up the memory usage ?
>>
>> --
>>
>> For comparison, I killed the factor session and restart it from the shell.
>> The numbers are
>> VIRT  940M
>> RES  182M
>> %MEM 2.2%
>>
>> ==> Had the factor continued to run last night,
>>it would have probably exhausted the memory on the machine.
>>I guess there might be some memory (leak) issue somewhere ???
>>
>> --HP
>>
>>
>>
>> -- Forwarded message --
>> From: HP wei <hpwe...@gmail.com>
>> Date: Thu, Oct 1, 2015 at 9:36 AM
>> Subject: how to error trapping 'link-info'
>> To: factor-talk@lists.sourceforge.net
>>
>>
>> As suggested by John, I test out the following action to
>> get the total file sizes of a disk volume.
>>
>> 0 "a_path_to_big_folder" [ link-info dup symbolic-link? [ drop ] [ size>>
>> + ] if  ] each-file
>>
>>
>> Our big-folder is on a netapp server shared by tens of people. Many small
>> files get updated
>> every minutes if not seconds. The update may involve removing the file
>> first.
>> It has many many subfolders which in turn have more subfolders.
>> Each subfolder may have hundreds of files (occasionally in the thousands).
>>
>> After a few day's discussion with factor guru's, I understand that
>> each-file traverses the directory structure by first putting
>> entries of a folder in a sequence. And it processes each entry one by one.
>> Although this may not cause using big chunk of memory at a time,
>> it does have the following issue..
>>
>> 
>>
>> Last night, I left the command running and came back this morning to find
>> that it failed with the message.
>> lstat:  "a path to a file" does not exist !!!
>>
>> This is because after 'each-file' puts the file into the sequence and
>> then when
>> it is its turn to be processed, i

Re: [Factor-talk] A bug ?

2015-10-01 Thread HP Wei
Thanks for suggesting that I look at the source of (directory-entries).

I see that the iterator over a directory is the word: with-unix-directory
and (directory-entries) uses produce to collect the entries into a sequence.

I did not find a word in sequences that is similar to produce but does a
‘reduce’ action,
so that I could simply replace ‘produce’ in the definition of
(directory-entries).

So, my next thought is to come up with a word, each-entry, which emits a
dirent
successively till the end.
Then the code below can tally up the total file size more efficiently (memory-wise).

path [ 0 [ quot ] each-entry ] with-unix-directory

--hp


> On Sep 30, 2015, at 5:05 PM, John Benediktsson <mrj...@gmail.com> wrote:
> 
> I mentioned before that it's not too hard to make an iterative using dirent, 
> especially if you just call it directly yourself. You can see how it works by 
> doing:
> 
> IN: scratchpad \ (directory-entries) see
> 
> Nothing technical prevents it, only that right now the iteration is hidden 
> behind that word where it produces a list of entries. 
> 
> In normal use, where a single directory doesn't have that many entries, this 
> is not a performance issue.  But, like anything with software, if you have a 
> different use case we can adapt the code to it. 
> 
> 
> On Sep 30, 2015, at 1:59 PM, HP wei <hpwe...@gmail.com 
> <mailto:hpwe...@gmail.com>> wrote:
> 
>> I see.  That is how factor distinguishes stat and lstat :)  Thanks.
>> 
>> Now I can try out the process on a folder with many subfolders and 
>> with millions of files.  
>> [ I wish in factor, there is a facility to make an iterator type of object 
>> out of dirent. ]
>> 
>> --HP
>> 
>> 
>> On Wed, Sep 30, 2015 at 4:47 PM, Doug Coleman <doug.cole...@gmail.com 
>> <mailto:doug.cole...@gmail.com>> wrote:
>> You can do link-info instead.
>> 
>> 
>> On Wed, Sep 30, 2015, 13:42 HP wei <hpwe...@gmail.com 
>> <mailto:hpwe...@gmail.com>> wrote:
>> While trying out the word each-file,  I bumped into presumably
>> a bug in 
>> 
>> file-info ( path -- info )
>> 
>> Under linux,
>> if the path is a softlink (symbolic link),
>> 
>> path file-info symbolic-link?
>> 
>> gives 'f'    and this is wrong.
>> 
>> I looked at the implementation of file-info
>> and saw that it calls file-status which in turn calls
>> stat-func.
>> The latter calls __xstat64   this seems to be related to stat function.
>> 
>> Under linux, there are two functions that returns the stat-structure.
>> One is stat, the other lstat.
>> If a path is a symbolic link,
>> It is lstat that will return info about the link.
>> 
>> ---
>> 
>> As a result of this 'bug',  the following (as suggested by John the other 
>> day)
>> could not do what is intended (to get the size of a folder).
>> 
>> 0 a_path_to_folder t [ file-info dup symbolic-link? [ drop ] [ size>> + ] if 
>>  ] each-file 
>> 
>> 
>> --HP Wei
>> 
>> --
>> ___
>> Factor-talk mailing list
>> Factor-talk@lists.sourceforge.net <mailto:Factor-talk@lists.sourceforge.net>
>> https://lists.sourceforge.net/lists/listinfo/factor-talk 
>> <https://lists.sourceforge.net/lists/listinfo/factor-talk>
>> 
>> --
>> 
>> ___
>> Factor-talk mailing list
>> Factor-talk@lists.sourceforge.net <mailto:Factor-talk@lists.sourceforge.net>
>> https://lists.sourceforge.net/lists/listinfo/factor-talk 
>> <https://lists.sourceforge.net/lists/listinfo/factor-talk>
>> 
>> 
>> --
>> ___
>> Factor-talk mailing list
>> Factor-talk@lists.sourceforge.net <mailto:Factor-talk@lists.sourceforge.net>
>> https://lists.sourceforge.net/lists/listinfo/factor-talk 
>> <https://lists.sourceforge.net/lists/listinfo/factor-talk>
> --
> ___
> Factor-talk mailing list
> Factor-talk@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/factor-talk



Re: [Factor-talk] A bug ?

2015-09-30 Thread HP wei
I see.  That is how factor distinguishes stat and lstat :)  Thanks.

Now I can try out the process on a folder with many subfolders and
with millions of files.
[ I wish factor had a facility to make an iterator type of object
out of dirent. ]

--HP


On Wed, Sep 30, 2015 at 4:47 PM, Doug Coleman <doug.cole...@gmail.com>
wrote:

> You can do link-info instead.
>
> On Wed, Sep 30, 2015, 13:42 HP wei <hpwe...@gmail.com> wrote:
>
>> While trying out the word each-file,  I bumped into presumably
>> a bug in
>>
>> file-info ( path -- info )
>>
>> Under linux,
>> if the path is a softlink (symbolic link),
>>
>> path file-info symbolic-link?
>>
>> gives 'f'    and this is wrong.
>>
>> I looked at the implementation of file-info
>> and saw that it calls file-status which in turn calls
>> stat-func.
>> The latter calls __xstat64   this seems to be related to stat
>> function.
>>
>> Under linux, there are two functions that returns the stat-structure.
>> One is stat, the other lstat.
>> If a path is a symbolic link,
>> It is lstat that will return info about the link.
>>
>> ---
>>
>> As a result of this 'bug',  the following (as suggested by John the other
>> day)
>> could not do what is intended (to get the size of a folder).
>>
>> 0 a_path_to_folder t [ file-info dup symbolic-link? [ drop ] [ size>> + ]
>> if  ] each-file
>>
>>
>> --HP Wei
>>
>>
>> --
>> ___
>> Factor-talk mailing list
>> Factor-talk@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/factor-talk
>>
>
>
> --
>
> ___
> Factor-talk mailing list
> Factor-talk@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/factor-talk
>
>


[Factor-talk] to get total file size for millions of files

2015-09-28 Thread HP wei
In our environment, we sometimes have a folder with
as many as a couple million files in it.

(1) the word: each-file ( path bfs? quot -- )
 Does it hand each file successively to quot without first gathering all
 the file-paths under the path ?

(2) what is the idiomatic way to get the total file size for all files
 in a folder (and its sub-folders) ?
 Using each-file in (1), I am forced to set up a global 'variable'
called
 total-size.

 --

 If I were to process a big file to collect some info, I could write:

 "path-to-file"  ascii  [  V{ }  [ quot ]  each-line  ]
with-file-reader

 The collected info is on the stack after the above finishes.

 

 To go through a huge directory (folder),
 do you know if the current factor can set up something similar ?


Thanks
HP Wei


Re: [Factor-talk] how to run-pipeline

2015-09-22 Thread HP Wei
Thanks, Alex, for pointing out the ‘make’ word, which is something new to
me.
I will study the example usages that you listed.



I realize that I did not make my original intention clear enough.
Here is what I want to do in factor:

(1) invoke custom command in shell:  cmd1 -a A -b B
 The result is printed out to stdout.
(2) Take the output of (1) and select (or manipulate) the lines
  And print them to stdout
(3) invoke another custom command: cmd2 -c C…

So, in factor, from what I learned so far, to accomplish the above I can do

{ “cmd1 -a A -b B” [ quot ] “cmd2 -c C” } run-pipeline

My original issue was to construct the strings “cmd1 -a A -b B” and “cmd2 -c C”
in a flexible way, so that I can choose to supply (or not supply) those
arguments A, B, and C,
and to find a clean way to put everything together into { … } for run-pipeline.
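That flexible construction can be sketched in Python (a hypothetical helper, mirroring the cmd1 tuple with default slots): options whose value is None are simply left out of the command string.

```python
def build_cmd(name, **opts):
    """Build one command string, omitting options whose value is None --
    a sketch of the cmd1-tuple-plus-get-cmd1 idea."""
    parts = [name]
    for flag, value in opts.items():
        if value is not None:
            parts.append(f"-{flag} {value}")
    return " ".join(parts)

# supply a and b for cmd1, omit c for cmd2 entirely:
pipeline = [build_cmd("cmd1", a="A", b="B"), build_cmd("cmd2", c=None)]
print(pipeline)  # -> ['cmd1 -a A -b B', 'cmd2']
```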

By the way,
Your suggestion in another email of implementing >process looks interesting to 
explore.
Any example usage in this case ?

Thanks
HP
 

> On Sep 22, 2015, at 12:23 PM, Alex Vondrak <ajvond...@gmail.com> wrote:
> 
> Ultimately, I may also insert some factor quot in betweeen
> str1 and str2 to do some processing before handing the
> result to cmd2.
> 
> Do you mean you want to take the output of running cmd1, manipulate it, then 
> pass *that* to cmd2? Because that sounds rather different from what your 
> example code looks like it's actually trying to do.
> 
> It seems like your example is trying to construct launch descriptors 
> independently, then pass those entire results to run-pipeline at once. Which 
> is altogether easier: if I understand right, you're basically there already, 
> but your main concern is more about how to build the array in a prettier way? 
> If that's it, I suggest the `make` vocabulary: 
> http://docs.factorcode.org/content/article-namespaces-make.html 
> <http://docs.factorcode.org/content/article-namespaces-make.html>
> 
> Some examples of `make` usage in the wild:
> https://github.com/slavapestov/factor/blob/master/basis/io/backend/unix/unix-tests.factor#L142-L147
>  
> <https://github.com/slavapestov/factor/blob/master/basis/io/backend/unix/unix-tests.factor#L142-L147>
> https://github.com/slavapestov/factor/blob/master/basis/bootstrap/image/upload/upload.factor#L47-L51
>  
> <https://github.com/slavapestov/factor/blob/master/basis/bootstrap/image/upload/upload.factor#L47-L51>
> https://github.com/slavapestov/factor/blob/master/extra/graphviz/render/render.factor#L62-L67
>  
> <https://github.com/slavapestov/factor/blob/master/extra/graphviz/render/render.factor#L62-L67>
> 
> Granted, all of those are building a single process, not a pipeline. But the 
> same principles apply:
> 
> : cmd1 ( -- ) ... ;
> : cmd2 ( -- ) ... ;
> 
> [ cmd1 , cmd2 , ] { } make run-pipeline
> 
> On Mon, Sep 21, 2015 at 9:21 PM, HP Wei <hpwe...@gmail.com 
> <mailto:hpwe...@gmail.com>> wrote:
> I want to run binary codes (C++) under linux using run-pipeline
> 
> In linux shell, the task is
> 
> cmd1 -a arg1 -b arg2 | cmd2 -c arg3
> 
> I know in general, in factor, I need to construct
> 
> { str1  str2 } run-pipeline
> where str1 = “cmd1 -a arg1 -b arg2”
>str2 = “cmd2 -c arg3”
> Ultimately, I may also insert some factor quot in betweeen
> str1 and str2 to do some processing before handing the
> result to cmd2.
> 
> 
> Here is what I envision:
> 
> TUPLE: cmd1 a b ;
> 
> : <cmd1> ( -- cmd1 )
> cmd1 new
> "default a" >>a
> "default b" >>b ;
> 
> : get-cmd1 ( cmd1 -- str1 )
>    [ a>> ] [ b>> ] bi
>    "cmd1 -a %s -b %s" sprintf ;
> 
> so now, I can write
> 
> <cmd1>
>    my_b >>b
> get-cmd1
> 
> -- similarly for cmd2.
> 
> But I bump into a mental block when trying to
> put things together for run-pipeline.
> 
> If there were just one cmd1 (without cmd2),
> I thought I could do
> 
> ${ <cmd1> my_b >>b get-cmd1 } run-pipeline
> 
> Adding cmd2, I could write
> 
> ${ <cmd1> my_b >>b get-cmd1  <cmd2> my_c >>c get-cmd2 } run-pipeline
> 
> But this looks ugly.
> Is there a simpler way ?
> 
> Thanks
> HP Wei
> 
> 
> 
> --
> ___
> Factor-talk mailing list
> Factor-talk@lists.sourceforge.net <mailto:Factor-talk@lists.sourceforge.net>
> https://lists.sourceforge.net/lists/listinfo/factor-talk 
> <https://lists.sourceforge.net/lists/listinfo/factor-talk>
> 
> --
> ___
> Factor-talk mailing list
> Factor-talk@li

[Factor-talk] how to run-pipeline

2015-09-21 Thread HP Wei
I want to run binary code (C++) under linux using run-pipeline.

In linux shell, the task is 

cmd1 -a arg1 -b arg2 | cmd2 -c arg3

I know in general, in factor, I need to construct

{ str1  str2 } run-pipeline
where str1 = “cmd1 -a arg1 -b arg2”
   str2 = “cmd2 -c arg3”
Ultimately, I may also insert some factor quot in between
str1 and str2 to do some processing before handing the
result to cmd2.


Here is what I envision:

TUPLE: cmd1 a b ;

: <cmd1> ( -- cmd1 )
cmd1 new
"default a" >>a
"default b" >>b ;

: get-cmd1 ( cmd1 -- str1 )
   [ a>> ] [ b>> ] bi
   "cmd1 -a %s -b %s" sprintf ;

so now, I can write

<cmd1>
   my_b >>b
get-cmd1

-- similarly for cmd2.

But I bump into a mental block when trying to
put things together for run-pipeline.

If there were just one cmd1 (without cmd2),
I thought I could do

${ <cmd1> my_b >>b get-cmd1 } run-pipeline

Adding cmd2, I could write

${ <cmd1> my_b >>b get-cmd1  <cmd2> my_c >>c get-cmd2 } run-pipeline

But this looks ugly.
Is there a simpler way ?
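For comparison, the pipe-with-processing-in-between shape can be sketched in Python (hypothetical stages standing in for cmd1 and cmd2; subprocess plays the role of run-pipeline):

```python
import subprocess
import sys

def run_stage(code, stdin_text=""):
    """Run one child-process stage and return its stdout
    (a stand-in for one element of a run-pipeline array)."""
    return subprocess.run(
        [sys.executable, "-c", code],
        input=stdin_text, capture_output=True, text=True, check=True,
    ).stdout

# stage 1: "cmd1" emits some lines
out1 = run_stage("print('alpha'); print('beta')")

# the in-between quotation: keep only lines starting with 'a'
middle = "\n".join(line for line in out1.splitlines() if line.startswith("a"))

# stage 2: "cmd2" uppercases whatever arrives on stdin
out2 = run_stage("import sys; sys.stdout.write(sys.stdin.read().upper())", middle)
print(out2)  # -> ALPHA
```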

Thanks
HP Wei





[Factor-talk] newbie question: how to collect info from a stream

2015-09-01 Thread HP Wei
I am just starting to learn factor.

In OCaml or Python, when I open a file stream, I usually set up an object
with an accumulator class variable where I collect the selected info while 
walking through the stream (file).

I am trying to look at various places to find an equivalent way of doing this
[ or the natural way (folding?) of achieving the goal ] in factor.
Could you please help me out ?
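The folding idea can be sketched in Python (an illustrative stand-in, not factor code): the accumulator is threaded through the fold rather than stored on an object, which is the natural shape in factor as well.

```python
from functools import reduce
from io import StringIO

# a stand-in for an opened file stream
stream = StringIO("apple\nbanana\ncherry\n")

# walk the stream, threading the accumulator through the fold:
# collect the lines containing "an"
selected = reduce(
    lambda acc, line: acc + [line.strip()] if "an" in line else acc,
    stream,
    [],
)
print(selected)  # -> ['banana']
```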

thanks
HP
 
