Hi
Space: 700 MB vs 4.5 GB sounds like way too big a difference. Are you sure
you aren't loading multiple copies of the data or something like that?

Queries: a 20x slowdown for a multi-field query also sounds way too big.
What do the simple and multi-field queries look like?

--
Ian.

On Wed, Jan 21, 2009 at 1:39 PM, Anshul jain <anshul.j...@epfl.ch> wrote:
> Hi,
>
> I've indexed around half a million XML documents. Here is the document
> sample:
>
> <a:attribute>
>   <a:name>cogito:Name</a:name>
>   <a:value>Alexander the Great</a:value>
> </a:attribute>
>
> <a:attribute>
>   <a:name>cogito:domain</a:name>
>   <a:value>ancient history</a:value>
> </a:attribute>
>
> <a:attribute>
>   <a:name>cogito:first_sentence</a:name>
>   <a:value>
>     Alexander the Great (Greek: or Megas Alexandros; July 20 356 BC June 10 323
>     BC), also known as Alexander III, was an ancient Greek king (basileus) of
>     Macedon (336-323 BC).
>   </a:value>
> </a:attribute>
>
> Average size of documents is around 4KB.
>
> There are a few performance issues I need help with. When I index documents,
> in a structured manner, using field information like:
>
> name: alexander the great
> domain: ancient history
> first_sentence: Alexander the Great (Greek: or Megas Alexandros; July 20 356
> BC June 10 323 BC), also known as Alexander III, was an ancient Greek king
> (basileus) of Macedon (336-323 BC).
> bagOfWords: alexander the great ancient history Alexander the Great (Greek:
> or Megas Alexandros; July 20 356 BC June 10 323 BC), also known as Alexander
> III, was an ancient Greek king (basileus) of Macedon (336-323 BC).
>
> bagOfWords is the field with all the text appended to it.
>
> I get an index size of 4.5 GB, but if I just append the text and store it in
> one field, like:
>
> value: alexander the great ancient history Alexander the Great (Greek: or
> Megas Alexandros; July 20 356 BC June 10 323 BC), also known as Alexander
> III, was an ancient Greek king (basileus) of Macedon (336-323 BC).
>
> the index size is only 700 MB. Why is this happening?
>
> Also, the query execution time of MultiFieldQueries is very slow; it is 20
> times slower than a single-field query. Is that normal? What could be the
> reason for that?
>
> Thanks,
> Cheers,
> Anshul
>
> --
> Anshul Jain
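
As a rough sanity check on the size question, here is a back-of-envelope sketch in plain Java (no Lucene involved) that counts tokens for the sample document under the two layouts described in the quoted message. The class and method names are made up for illustration, and whitespace splitting is only an assumed stand-in for whatever analyzer is really in use. Token count is at best a crude proxy for index size, but because bagOfWords repeats every per-field token, the structured layout comes out at twice the single-field count by construction, which by itself is well short of the reported 700 MB vs 4.5 GB gap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TokenCountSketch {

    // Count whitespace-separated tokens. This is an assumption standing in
    // for a real analyzer, which would tokenize and filter differently.
    static int countTokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Sum the token counts of every field in one document.
    static int totalTokens(Map<String, String> fields) {
        int total = 0;
        for (String value : fields.values()) {
            total += countTokens(value);
        }
        return total;
    }

    // Structured layout from the message: three fields plus bagOfWords.
    static Map<String, String> structuredLayout(String name, String domain, String firstSentence) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", name);
        fields.put("domain", domain);
        fields.put("first_sentence", firstSentence);
        fields.put("bagOfWords", name + " " + domain + " " + firstSentence);
        return fields;
    }

    // Single-field layout from the message: everything appended into "value".
    static Map<String, String> singleLayout(String name, String domain, String firstSentence) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("value", name + " " + domain + " " + firstSentence);
        return fields;
    }

    public static void main(String[] args) {
        String name = "alexander the great";
        String domain = "ancient history";
        String firstSentence = "Alexander the Great (Greek: or Megas Alexandros; "
                + "July 20 356 BC June 10 323 BC), also known as Alexander III, "
                + "was an ancient Greek king (basileus) of Macedon (336-323 BC).";

        int structured = totalTokens(structuredLayout(name, domain, firstSentence));
        int single = totalTokens(singleLayout(name, domain, firstSentence));

        System.out.println("structured tokens: " + structured);
        System.out.println("single-field tokens: " + single);
        System.out.println("ratio: " + (double) structured / single);
    }
}
```

If the real index shows a much larger ratio than this, that points toward something other than the extra fields, such as the same documents being added more than once.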