Hi Saikrishna,

Thanks for the reply, but I don't know how to go about this. Here is my code sample; let me know where to change.
SAXBuilder builder = new SAXBuilder();
// CONTENT here is a ByteArrayInputStream. I know I can also give a file URL here;
// let me know what is best.
Document doc = builder.build(CONTENT);
loop(---) {
    doc.selectNodes(xpathquery);
}

Thanks...

----- Original Message ----
From: saikrishna venkata pendyala <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, 22 January, 2007 10:07:27 AM
Subject: Re: Big size xml file indexing

Hi,

I have indexed a 6.2 GB XML file using Lucene. What I did was:

1. I split the 6.2 GB file into small files, each of size 10 MB.
2. I then wrote a Python script to equalize the number of documents in each file.

The structure of my XML file is:

"""
<document>
-----
-----
</document>
<document>
-----
-----
</document>
"""

Since you cannot go beyond 500 MB of heap, this technique might help you, provided of course that your file structure is the same.

On 1/22/07, aslam bari <[EMAIL PROTECTED]> wrote:
>
> Dear all,
> I am using Lucene to index XML files. For parsing I am using JDOM to get
> XPath nodes, do some manipulation on them, and index them. All works
> well, but when the file size is very big, about 35 - 50 MB, it runs
> out of memory or takes a lot of time. How can I set some parameters to
> speed up parsing and use less memory? The problem is that I cannot
> increase the heap size much, so I have to limit it to 300 - 500 MB.
> Does anybody have a solution for this?
>
> Thanks...
>
> __________________________________________________________
> Yahoo! India Answers: Share what you know. Learn something new
> http://in.answers.yahoo.com/
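The splitting step saikrishna describes could be sketched as below. Note this is only an illustration of the idea, not code from the thread: the class name `XmlSplitter`, the parameter `maxChunkChars`, and the in-memory `String` input are all assumptions for brevity; a real splitter for a multi-gigabyte file would stream from disk with a `BufferedReader` and write each chunk to its own file instead of holding everything in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: cut a large XML body into chunks at </document>
// boundaries so that each chunk contains only whole documents and
// can be parsed on its own with a small heap.
public class XmlSplitter {

    // Splits the input into chunks of complete <document> elements,
    // closing the current chunk once it reaches maxChunkChars.
    public static List<String> split(String xml, int maxChunkChars) {
        List<String> chunks = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        String endTag = "</document>";
        int pos = 0;
        while (pos < xml.length()) {
            int end = xml.indexOf(endTag, pos);
            if (end < 0) {
                break; // no further complete document; stop here
            }
            int next = end + endTag.length();
            current.append(xml, pos, next); // copy one whole document
            pos = next;
            if (current.length() >= maxChunkChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString()); // remaining partial chunk
        }
        return chunks;
    }
}
```

Each resulting chunk is a flat sequence of `<document>` elements, so it can be parsed and indexed independently, one chunk at a time, which is what keeps peak memory bounded.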