Thanks, Nick, for your reply and for taking the time on this. One quick question
before you get lost in the email below: release 0.6.1 has the fix for the bug
below, right?

> BasicFlexGroup0_io_workload/pm_io/.lucyindex/1 :  input 47 too high 
> S_fibonacci at core/Lucy/Index/IndexManager.c line 129

Thanks,
Rajiv g

-----Original Message-----
From: Nick Wellnhofer [mailto:[email protected]] 
Sent: Saturday, December 17, 2016 2:52 AM
To: [email protected]
Subject: Re: [lucy-user] LUCY_Folder_Open_Out_IMP at core/Lucy/Store/Folder.c 
line 119

On 13/12/2016 18:05, Gupta, Rajiv wrote:
> After I create directory by myself I'm getting this error:

Which directory do you try to create? I wouldn't try to make manual changes 
inside Lucy's index directory. This will only make things worse.

        $indexer = Lucy::Index::Indexer->new(
            index    => $saveindexlocation,
            schema   => $schema,
            manager  => Lucy::Index::IndexManager->new(host => $self->{_hostname}),
            create   => $dir_create_flag,
            truncate => 0,
        );

The "create" flag is initially set to 1 so that $saveindexlocation gets created.
After I got the error, I made sure the directory exists and set the create flag
to always be 0.

> Can't open 
> '/u/smoke/presub/logs/cit-fg-adr-neg-rtp.rajivg.1481473130.41339_cmode_1of1/.lucyindex/1/seg_fd/lexicon-7.ixix':
>  Invalid argument
> 20161211 182109 [] *    LUCY_FSFolder_Local_Open_FileHandle_IMP at 
> core/Lucy/Store/FSFolder.c line 118
> 20161211 182109 [] *    LUCY_Folder_Local_Open_In_IMP at 
> core/Lucy/Store/Folder.c line 101
> 20161211 182109 [] *    LUCY_Folder_Open_In_IMP at core/Lucy/Store/Folder.c 
> line 75
>
> There are two more failures; they also failed due to similar reasons
>
> rename from 
> '/u/smoke/presub/logs/cit-fg-adr-ndo-rtp.rajivg.1481473039.49384_cmode
> _1of1/.lucyindex/1/schema.temp' to 
> '/u/smoke/presub/logs/cit-fg-adr-ndo-rtp.rajivg.1481473039.49384_cmode
> _1of1/.lucyindex/1/schema_e4.json' failed: No such file or directory
>
> Can't delete 'lexicon-3.ix'
>
> I believe all three are related to race condition while doing parallel 
> indexing and should go away with retries. However my retries started failing 
> with different error which is strange to me as if directory already exists 
> shouldn't it skip from create attempt.
>
> 20161211 182109 [] * FAIL: [FAILED]: Retrying to add doc at path: 
> /u/smoke/presub/logs/cit-fg-adr-neg-rtp.rajivg.1481473130.41339_cmode_1of1/.lucyindex/1
>  :  Couldn't create directory 
> '/u/smoke/presub/logs/cit-fg-adr-neg-rtp.rajivg.1481473130.41339_cmode_1of1/.lucyindex/1':
>  No such file or directory
> 20161211 182109 [] *    LUCY_FSFolder_Initialize_IMP at 
> core/Lucy/Store/FSFolder.c line 102
>
> So all my retry attempts also failed.

These errors still look like multiple processes are modifying the index at the 
same time. Are you really sure that every indexer is created with an 
IndexManager and that every IndexManager is created with a `host` argument that 
is unique to each machine?

Rajiv>>> All parallel processes are child processes of one process and run on
the same host. Do you think making the host name unique with some random number
would help for multiple processes?

All these errors mean that there's something fundamentally wrong with your code 
or that you hit a bug in Lucy. The only type of error where it makes sense to 
retry is LockErr. All other errors are mostly fatal and could result in index 
corruption. Retrying will only mask an underlying problem in this case.
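
For illustration, a minimal sketch of retrying only on LockErr might look like 
the following. The helper name open_with_lock_retry and its parameters are 
hypothetical and not part of Lucy; the sketch only assumes that lock failures 
arrive as exception objects of class Lucy::Store::LockErr.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper (not part of Lucy): call $open_indexer, a code ref that
# returns an indexer, and retry only when the exception is a LockErr.
sub open_with_lock_retry {
    my ( $open_indexer, $max_tries, $wait_seconds ) = @_;
    for my $try ( 1 .. $max_tries ) {
        my $indexer = eval { $open_indexer->() };
        return $indexer if $indexer;
        my $err = $@;
        if ( blessed($err) && $err->isa('Lucy::Store::LockErr') ) {
            # Another process holds the write lock: wait, then try again.
            sleep $wait_seconds if $try < $max_tries;
            next;
        }
        # Any other error may point to a damaged index: re-throw, don't retry.
        die( $err || "indexer constructor returned nothing\n" );
    }
    die "Could not obtain the write lock after $max_tries tries\n";
}
```

The code ref passed in would wrap the Lucy::Index::Indexer->new(...) call, so 
errors such as a failed rename or a missing lexicon file propagate immediately 
instead of being retried.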

Unfortunately, I'm unable to help unless you provide some kind of 
self-contained, reproducible test case. I'm aware that this isn't easy, 
especially with multiple clients writing to a shared volume.

As I already hinted at, you might want to reconsider your architecture and use 
some kind of search server that uses an index on a local filesystem. There are 
ready-made platforms on top of Lucy like Dezi, but it isn't too hard to roll 
your own solution. This should result in better performance and makes behavior 
of your code more predictable.

Rajiv>>> Moving to a local file system is not possible in my case. This is a 
test framework that generates a lot of logs. I index per test run, and all 
these logs need to be on a shared volume for other triaging purposes. The next 
thing I'm going to try is to create one watcher per directory and index all 
files under that directory serially. Currently I create watchers on all the 
files, and sometimes multiple files in the same directory may get indexed at 
the same time. As you stated, this might be the issue. I'm not sure how it will 
perform within the current time limits. Creating the IndexManager adds overhead 
to the search process. 
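
A rough sketch of the per-directory idea, assuming the watcher hands over a 
list of file paths: group the paths by parent directory, then process each 
group one file at a time so a given index directory only ever has one writer. 
The names group_by_directory and index_serially are hypothetical, and 
$index_file stands in for the real indexing step.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Group file paths by parent directory, keeping arrival order within a group.
sub group_by_directory {
    my @paths = @_;
    my %files_in;
    for my $path (@paths) {
        push @{ $files_in{ dirname($path) } }, $path;
    }
    return \%files_in;
}

# Hypothetical driver: handle one directory at a time and, within it, one
# file at a time. $index_file is a code ref called as ($dir, $path).
sub index_serially {
    my ( $files_in, $index_file ) = @_;
    for my $dir ( sort keys %$files_in ) {
        $index_file->( $dir, $_ ) for @{ $files_in->{$dir} };
    }
}
```

Whether this fits the time limits depends on how many files land in a single 
directory, since those are no longer indexed in parallel.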

Nick
