Re: [Factor-talk] potential memory issue --- Fwd: how to error trapping 'link-info'

2015-10-01 Thread John Benediktsson
Maybe you can debug a little if you see that happen again?

Perhaps something like this to see which classes have the largest number of
instances, if there is a per-file leak:

IN: scratchpad all-instances [ class-of ] histogram-by sort-values reverse 10 head .

Some other words for inspecting memory:

http://docs.factorcode.org/content/article-tools.memory.html
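
It can also help to force a full garbage collection before counting instances,
and then print a summary of heap usage. A rough sketch (assuming gc comes from
the memory vocabulary and room. from tools.memory):

IN: scratchpad gc     ! collect garbage so only reachable objects remain
IN: scratchpad room.  ! print a summary of data and code heap usage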

Can you give us some information about your disk layout?

Is it one big directory with 1 million files?  Is it a tree of
directories?  What do you think the average number of files per directory is?

I opened a bug report if you'd like to provide feedback there rather than
the mailing list:

https://github.com/slavapestov/factor/issues/1483




On Thu, Oct 1, 2015 at 8:38 AM, HP wei  wrote:

> Well, I just checked the running Factor session that failed the overnight
> task I mentioned in the email below.
>
> From the Linux system command 'top',
> I see that this particular Factor instance is using:
> VIRT  4.0g
> RES   2.0g
> %MEM  26%
>
> I clicked on the restart-listener button and the numbers remained the same.
> Should I have done more to clean up the memory usage?
>
> --
>
> For comparison, I killed the Factor session and restarted it from the shell.
> The numbers are:
> VIRT  940M
> RES   182M
> %MEM  2.2%
>
> ==> Had Factor continued to run last night,
>    it would probably have exhausted the memory on the machine.
>    I guess there might be a memory (leak) issue somewhere?
>
> --HP
>
>
>
> -- Forwarded message --
> From: HP wei 
> Date: Thu, Oct 1, 2015 at 9:36 AM
> Subject: how to error trapping 'link-info'
> To: factor-talk@lists.sourceforge.net
>
>
> As suggested by John, I tested out the following action to
> get the total file size of a disk volume:
>
> 0 "a_path_to_big_folder" t [ link-info dup symbolic-link? [ drop ] [ size>> + ] if ] each-file
>
>
> Our big folder is on a NetApp server shared by tens of people. Many small
> files get updated every minute, if not every second. An update may involve
> removing the file first.
> It has many, many subfolders, which in turn have more subfolders.
> Each subfolder may have hundreds of files (occasionally thousands).
>
> After a few days' discussion with the Factor gurus, I understand that
> each-file traverses the directory structure by first putting the entries of
> a folder into a sequence, and then processing each entry one by one.
> Although this may not use a big chunk of memory at a time,
> it does have the following issue.
>
> 
>
> Last night, I left the command running and came back this morning to find
> that it had failed with the message:
> lstat:  "a path to a file" does not exist !!!
>
> This is because after 'each-file' puts a file into the sequence, by the time
> its turn comes to be processed, it is no longer there!
> Without error trapping, the above "0 ... each-file" cannot work in our case.
>
> So, I guess I need to do error trapping on the word link-info.
> I do not know how to do it.  Any hint?
>
> Thanks
> HP
>
>
>
>
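One way to trap errors from link-info, as asked above, is the recover
combinator from Factor's continuations vocabulary: it calls a quotation and,
if an error is thrown, runs a recovery quotation instead. A minimal sketch
(the word name safe-file-size is made up for illustration, and the USING:
list is approximate):

USING: accessors continuations io.directories.search io.files.info kernel math ;

! Returns the file's size, or 0 if it is a symbolic link or has vanished
! between being listed and being stat'ed.
: safe-file-size ( path -- n )
    [ link-info dup symbolic-link? [ drop 0 ] [ size>> ] if ]
    [ 2drop 0 ] recover ;

0 "a_path_to_big_folder" t [ safe-file-size + ] each-file

With something like this, a file that disappears mid-traversal simply
contributes 0 bytes instead of aborting the whole run.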


Re: [Factor-talk] potential memory issue --- Fwd: how to error trapping 'link-info'

2015-10-01 Thread HP wei
Yes, I could find out a bit more about the memory issue.

I tried it again this afternoon.  After 50 minutes into the action
  0 "path" t [ link-info ... ] each-file
the system 'top' shows RES rising above 1.2GB and %MEM reaching 15.7%,
and they continue to rise.
It blacks out the GUI window of Factor.

I tried hitting Control-C but it continues to run.
*** How do I exit a running word?

It looks like the only natural way I know of to 'stop' it is to wait for
link-info to hit the missing-file scenario --- like last night's overnight run.

So, I just killed the Factor session from the shell, and missed the
opportunity to inspect the memory usage in Factor, as John suggested.

Is there a way to exit a running word?
[ Perhaps I need to learn to use a Factor debugger? ]

-

Replying to John's questions about the disk layout:
   It is a disk with a tree of directories.
   Directory count: ~6000
   Total number of files as of now: ~1.1 million
   Total number of soft links: ~57
   Total file size: ~70GB

   The number of files in each sub-directory (not including files in the
   sub-directories inside it) ranges from a few hundred to as high as
   roughly 10K.

   Some of the directories are constantly updated throughout the day.

--HP




Re: [Factor-talk] potential memory issue --- Fwd: how to error trapping 'link-info'

2015-10-01 Thread Doug Coleman
You can run your code in the leaks combinator and it will show you what
leaked. I suspect that you're just using a lot of memory, though.

[ { 1 2 3 } [ malloc drop ] each ] leaks members .
{ ~malloc-ptr~ ~malloc-ptr~ ~malloc-ptr~ }
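
Applied to the traversal in this thread, that might look like the following
sketch (safe-file-size is the hypothetical error-trapping word sketched
earlier in the thread; the quotation must leave the stack balanced, hence the
trailing drop of the byte total):

[ 0 "a_path_to_big_folder" t [ safe-file-size + ] each-file drop ] leaks members .

If the result is empty while RES keeps climbing, the growth is more likely
ordinary heap usage than leaked external resources.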

On Thu, Oct 1, 2015 at 12:31 PM, HP wei  wrote:

> Yes, I could find out a bit more about the memory issue.
>
> I tried it again this afternoon.  After 50 minutes into the action
>   0 "path" t [ link-info ... ] each-file
> the system 'top' shows RES rises above 1.2GB and %MEM becomes 15.7%
> and they continue to rise.
> It blacks out the gui window of factor.
>
> I try to hit Control-C but it continues to run.
> *** How to exit a running words ?
>
> It looks like the only natural way I know of to 'stop' it is to wait for
> link-info to hit the missing file scenario --- like the overnight run of
> last night.
>
> So, I just killed the factor session from the shell.  And missed the
> opportunity
> to inspect the memory usage in factor, as John suggested.
>
> Is there a way to exit running words ?
> [ perhaps, I need to learn to use a factor-debugger ? ]
>
> -
>
> Replying to John's questions about the disk layout:
>it is a disk with a tree of directories.
>directory count ~ 6000
>total number of files as of now ~ 1.1 million
>total number of softlinks ~ 57
>total file size ~ 70GB
>
>number of files in each sub-directory (not including the files in
> sub-directory inside it)
>range from a few hundreds to as high as of the order of <~10K.
>
>Some of the directories are constantly updated throughout the day.
>
> --HP
>
>
>
> On Thu, Oct 1, 2015 at 12:27 PM, John Benediktsson 
> wrote:
>
>> Maybe you can debug a little if you see that happen again?
>>
>> Perhaps something like this to get the largest number of instances, if
>> there is a per-file leak:
>>
>> IN: scratchpad all-instances [ class-of ] histogram-by
>> sort-values reverse 10 head .
>>
>> Some other words for inspecting memory:
>>
>> http://docs.factorcode.org/content/article-tools.memory.html
>>
>> Can you give us some information about your disk layout?
>>
>> Is it one big directory with 1 million files?  Is it a tree of
>> directories?  What do you think is average number of files per-directory?
>>
>> I opened a bug report if you'd like to provide feedback there rather than
>> the mailing list:
>>
>> https://github.com/slavapestov/factor/issues/1483
>>
>>
>>
>>
>> On Thu, Oct 1, 2015 at 8:38 AM, HP wei  wrote:
>>
>>> Well, I just checked the running factor session that failed the
>>> task overnight that I mentioned in below email.
>>>
>>> From the linux system command 'top',
>>> I see that this particular factor is using
>>> VIRT   4.0g
>>> RES   2.0g
>>> %MEM 26%
>>>
>>> I clicked on the restart listener button and the numbers remain the same.
>>> should I have done more to clean up the memory usage ?
>>>
>>> --
>>>
>>> For comparison, I killed the factor session and restart it from the
>>> shell.
>>> The numbers are
>>> VIRT  940M
>>> RES  182M
>>> %MEM 2.2%
>>>
>>> ==> Had the factor continued to run last night,
>>>it would have probably exhausted the memory on the machine.
>>>I guess there might be some memory (leak) issue somewhere ???
>>>
>>> --HP
>>>
>>>
>>>
>>> -- Forwarded message --
>>> From: HP wei 
>>> Date: Thu, Oct 1, 2015 at 9:36 AM
>>> Subject: how to error trapping 'link-info'
>>> To: factor-talk@lists.sourceforge.net
>>>
>>>
>>> As suggested by John, I test out the following action to
>>> get the total file sizes of a disk volume.
>>>
>>> 0 "a_path_to_big_folder" [ link-info dup symbolic-link? [ drop ] [
>>> size>> + ] if  ] each-file
>>>
>>>
>>> Our big-folder is on a netapp server shared by tens of people. Many
>>> small files get updated
>>> every minutes if not seconds. The update may involve removing the file
>>> first.
>>> It has many many subfolders which in turn have more subfolders.
>>> Each subfolder may have hundreds of files (occasionally in the
>>> thousands).
>>>
>>> After a few day's discussion with factor guru's, I understand that
>>> each-file traverses the directory structure by first putting
>>> entries of a folder in a sequence. And it processes each entry one by
>>> one.
>>> Although this may not cause using big chunk of memory at a time,
>>> it does have the following issue..
>>>
>>> 
>>>
>>> Last night, I left the command running and came back this morning to find
>>> that it failed with the message.
>>> lstat:  "a path to a file" does not exist !!!
>>>
>>> This is because after 'each-file' puts the file into the sequence and
>>> then when
>>> it is its turn to be processed, it is not there at the time!!
>>> Without error trapping, the above "0 ... each-file"  could not work in
>>> our case.
>>>
>>> So, I guess I would