Can you send the output of ls -lR <DataDir>/<DBName>?
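
If the full recursive listing is too long to post, a shorter list of the biggest files would also help. Below is a minimal sketch in Python, not part of ASPseek: the function name `largest_files` and the script name in the usage comment are made up for illustration, and it only assumes the database directory is readable by the user running it.

```python
import heapq
import os
import sys


def largest_files(root, count=10):
    """Return the `count` largest regular files under `root` as (size, path)."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Skip symlinks so a file is not counted through a link.
                continue
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # The file vanished or is unreadable, e.g. the indexer is running.
                continue
    # nlargest returns the entries sorted from biggest to smallest.
    return heapq.nlargest(count, sizes)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python largest_files.py <DataDir>/<DBName>
    for size, path in largest_files(sys.argv[1]):
        print(f"{size / 2**20:10.1f}M  {path}")
```

The output shows at a glance whether one kind of file inside the 00w-16w directories accounts for the growth.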

Gregory Kozlovsky wrote:

> Hello, Alexander,
>
> I tried twice to clean the database and start indexing anew. The same
> result.
> When I comment out the Converter statements
>
>     Converter application/pdf text/plain /usr/bin/pdftotext -q $in $out
>     Converter application/postscript text/plain /usr/local/bin/pstotext
>
> the indexing works fine.
>
> When indexing with the converters, the abnormally large directories are:
>  657M    00w
>  544M    01w
>  630M    02w
>  623M    03w
>  632M    04w
>  641M    05w
>  659M    06w
>  651M    07w
>  595M    08w
>  637M    09w
>  653M    10w
>  590M    11w
>  608M    12w
>  657M    13w
>  621M    14w
>  642M    15w
>  327M    16w
>
> Now I am trying to find out if the problem lies with .pdf or .ps files.
>
> What is the format of the files in the 00w-99w directories? Is it
> described somewhere?
>
>     Regards,
>
>         Gregory
>
> -----Original Message-----
> From: Alexander F Avdonkin [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, 6 July 2002 11:47
> To: [EMAIL PROTECTED]
> Subject: Re: [aseek-users]
>
> Possibly it happened because of corrupted delta files. See which files
> occupy the most space inside those directories.
> The only solution here is to reindex everything from a clean DB.
>
> Alexander.
>
> Gregory Kozlovsky wrote:
>
> > Hello, ASPseekers,
> >
> > I installed aspseek-1.2.9 and started indexing into an empty database.
> > However, the indexing stopped when 95390 docs were indexed and 352506
> > were found and not indexed. The reason is that /var/aspseek/dbname became
> > huge and filled all the available space. With the old version, this
> > directory took 5.2 G for about 2 million indexed docs; now it is 14 G.
> > Here is the output of "du *" inside the directory:
> >
> > [root@isn-search]# du *
> > 657M    00w
> > 544M    01w
> > 630M    02w
> > 623M    03w
> > 632M    04w
> > 641M    05w
> > 659M    06w
> > 651M    07w
> > 595M    08w
> > 637M    09w
> > 653M    10w
> > 590M    11w
> > 608M    12w
> > 657M    13w
> > 621M    14w
> > 642M    15w
> > 327M    16w
> > 39M     17w
> > 41M     18w
> > 43M     19w
> > 43M     20w
> > 37M     21w
> >
> > The rest of the subdirectories are normal size, around 50M. What is going
> > wrong? One more suspicious thing is that I started indexing .pdf
> > and .ps documents. Maybe the converters produce junk words? Which
> > converters do you use?
> >
> >         Gregory Kozlovsky
> >
> > Project Manager for Information Systems                 Tel: +41 (0)1 632 63 70
> > International Relations and Security Network (ISN)      Fax: +41 (0)1 632 14 13
> > Center for Security Studies and Conflict Research       Email: [EMAIL PROTECTED]
> > Swiss Federal Institute of Technology (ETH)             http://www.isn.ch
> > Leonhardshalde 21, ETH-Zentrum / LEH
> > CH-8092 Zürich, Switzerland
