Re: Lucene 2.0.1 release date

2006-10-15 Thread Raghavendra Prabhu

I would very much like to see the .NET port stay in line with Lucene Java.
This would give us index compatibility and the same features that Lucene
provides.

George - cheers for the continued effort to keep Lucene.Net in line with
Lucene.

Regards,
Prabhu




On 10/14/06, Otis Gospodnetic [EMAIL PROTECTED] wrote:


I'd have to check CHANGES.txt, but I don't think that many bugs have been
fixed or that many new features added, so I doubt anyone is itching for a new
release.

Otis

- Original Message 
From: George Aroush [EMAIL PROTECTED]
To: java-dev@lucene.apache.org; java-user@lucene.apache.org
Sent: Saturday, October 14, 2006 10:32:47 AM
Subject: RE: Lucene 2.0.1 release date

Hi folks,

Sorry for reposting this question (see original email below), this time
to both mailing lists.

If anyone can tell me what the plan is for the Lucene 2.0.1 release, I would
appreciate it very much.

As some of you may know, I am the porter of Lucene to Lucene.Net; knowing
when 2.0.1 will be released will help me plan things out.

Regards,

-- George Aroush


-Original Message-
From: George Aroush [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 12, 2006 12:07 AM
To: java-dev@lucene.apache.org
Subject: Lucene 2.0.1 release date

Hi folks,

What's the plan for Lucene 2.0.1 release date?

Thanks!

-- George Aroush


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: Searching API: QueryParser vs Programmatic queries

2006-05-22 Thread Raghavendra Prabhu

If I understand correctly, you don't want to make use of the query
parser?

You need to parse a query string without using QueryParser, construct
the query yourself, and still have an analyzer applied to the search terms.


On 5/22/06, Irving, Dave [EMAIL PROTECTED] wrote:


Hi Otis,

Thanks for your reply.
Yeah, I'm aware of PerFieldAnalyzerWrapper - and I think it could help in
the solution - but not on its own.
Here's what I mean:

When we build a document Field, we supply either a String or a Reader.
The framework takes care of running the contents through an Analyzer
(per field or otherwise) when we add the document to an index.

However, on the searching side of things, we don't have similar
functionality unless we use the QueryParser.
If we build queries programmatically, then we have to make sure (by hand)
that we run search terms through the appropriate analyzer whilst
constructing the query.

It's in this area that I wonder whether additional utility classes could
make programmatic construction of queries somewhat easier.

Dave
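The utility Dave is describing boils down to: run the user's text through the same normalization the index side used, before building Term objects. Here is a minimal stand-alone sketch of that idea; the lowercase-plus-whitespace step is only a stand-in for whatever Analyzer the field actually uses, and the class name is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class QueryTermNormalizer {
    // Stand-in for an Analyzer chain (tokenize + lowercase). A real helper
    // would run the per-field Analyzer's TokenStream over the input, exactly
    // as happens at index time, before constructing TermQuery objects.
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String tok : text.trim().split("\\s+")) {
            terms.add(tok.toLowerCase(Locale.ROOT));
        }
        return terms;
    }
}
```

With a helper like this registered per field, a programmatic query builder could accept raw text and emit already-analyzed terms, which is essentially the factory/builder idea discussed in this thread.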

 -Original Message-
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
 Sent: 22 May 2006 15:59
 To: java-user@lucene.apache.org
 Subject: Re: Searching API: QueryParser vs Programmatic queries

 Dave,
 You said you are new to Lucene and you didn't mention this
 class explicitly, so you may not be aware of it yet:
 PerFieldAnalyzerWrapper.
 It sounds like this may be what you are after.

 Otis

 - Original Message 
 From: Irving, Dave [EMAIL PROTECTED]
 To: java-user@lucene.apache.org
 Sent: Monday, May 22, 2006 5:15:23 AM
 Subject: Searching API: QueryParser vs Programmatic queries

 Hi,

 I'm very new to Lucene - so sorry if my question seems pretty dumb.

 In the application I'm writing, I've been struggling with
 myself over whether I should be building up queries
 programmatically, or using the QueryParser.

 My searchable fields are driven by meta-data, and I only want
 to support a few query types. It seems cleaner to build the
 queries up programmatically rather than converting the query
 to a string and throwing it through the QueryParser.

 However, then we hit the problem that the QueryParser takes
 care of analyzing the search strings - so to do this we'd
 have to write some utility stuff to perform the analysis as
 we're building up the queries / terms.

 And then I think "might as well just use the QueryParser!".

 So here's what I'm wondering (which probably sounds very dumb
 to experienced Lucene users):

 - Is there maybe some room for more utility classes in Lucene
 which make this easier? E.g.: when building up a document, we
 don't have to worry about running content through an analyzer
 - but unless we use QueryParser, there doesn't seem to be
 corresponding behaviour on the search side.
 - So, I'm thinking of some kind of factory / builder or
 something, where you can register an analyzer (possibly a per
 field wrapper), and then it is applied per field as the query
 is being built up programmatically.

 Maybe this is just an extraction refactoring to take this
 behaviour out of QueryParser (which could then delegate to it).

 The result could be that more users opt for programmatic
 build-up of queries (because it has become easier to do)
 rather than falling back on QueryParser in cases where it may
 not be the best choice.


 Sorry if I rambled too much :o)

 Dave



 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]





 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: java.io.IOException: Lock obtain timed out: Lock@/tmp/lucene-dcc982e203ef1d2aebb5d8a4b55b3a60-write.lock

2006-04-15 Thread Raghavendra Prabhu
You are creating two IndexWriters on the same directory.

I guess that is the reason for the problem - one of them holds the lock.

Rgds
Prabhu


On 4/15/06, Puneet Lakhina [EMAIL PROTECTED] wrote:

 Hi all,
 I am very new to Lucene. I am using it in my application to index and
 search through text files, and my program is more or less similar to the
 demo provided with the Lucene distribution.
 Initially everything was working fine without any problems. But today,
 while running the application, I have been getting this exception

 java.io.IOException: Lock obtain timed out: Lock@/tmp/lucene-
 dcc982e203ef1d2aebb5d8a4b55b3a60-write.lock

 whenever I try to read or write to the index. I am unable to understand why
 this is happening. Is there some mistake I am making in the code? I
 haven't changed any code, and it was working smoothly up until today!

 My version of lucene is 1.9.1

 I deleted the index directory and tried again, and voila, now it works
 again! But since I am going to be delivering my application, I would really
 like to know why this was happening, to guard against it.

 Thanks
 --
 Puneet
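The failure mode Prabhu describes (two writers, one lock) can be illustrated without Lucene at all. The sketch below uses plain java.nio file locks - which is not how Lucene 1.9 implements write.lock (that uses timed lock files) - but it shows the same principle: while one writer holds the lock, a second attempt on the same file fails. Class and method names are hypothetical:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

public class WriteLockDemo {
    // Returns true if a second attempt to lock the same file fails while
    // the first lock is still held - the analogue of a second IndexWriter
    // timing out on write.lock.
    public static boolean secondLockFails(File f) throws Exception {
        try (RandomAccessFile a = new RandomAccessFile(f, "rw");
             RandomAccessFile b = new RandomAccessFile(f, "rw")) {
            FileLock first = a.getChannel().lock(); // first "writer" takes the lock
            try {
                b.getChannel().tryLock();           // second "writer" tries
                return false;
            } catch (OverlappingFileLockException e) {
                return true;                        // same JVM: overlapping lock rejected
            } finally {
                first.release();
            }
        }
    }
}
```

The cure on the Lucene side is the same idea in reverse: keep only one IndexWriter (or deleting IndexReader) open on a directory at a time and always close it; a crash that leaves a stale lock file behind produces exactly this "Lock obtain timed out" error.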




Re: Compass Framework

2006-04-08 Thread Raghavendra Prabhu
A database implementation of the index is always bound to be slow compared to
storing it on the filesystem.

Probably the group which stores indexes in Berkeley DB would be able to
give you a performance measure of what happens when you store indexes in
databases.

Rgds
Prabhu


On 4/8/06, Marios Skounakis [EMAIL PROTECTED] wrote:




 Hi all,

 I recently came across the Compass Framework, which is built on top of
 Lucene. I am interested in it because it stores the Lucene index in an
 RDBMS and provides transaction support for index updates (it also has
 several other features, but this is the part I'm mostly interested in).

 I wanted to know if any people here have had any experience with Compass
 and what they think about it. Is the database implementation of the index
 fast enough, and does it introduce any additional issues/problems?

 Thanks in advance,
 Marios

 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]




Re: Compound Indexes Problem

2006-03-30 Thread Raghavendra Prabhu
Does changing the merge factor and setting SetUseCompoundFile(false)
split a single index into multiple pieces?

Even I have been doing something similar and would like to know how it is
done.

Rgds
Prabhu


On 3/31/06, depsi programmer [EMAIL PROTECTED] wrote:

 Hello,
 Thanks for your response.
 Can you please guide me on how to break this single index into multiple
 pieces? When I try to do so it corrupts the index.
 I had created an index with max merge docs set to 10,000 and compound
 files set to true. Then I called optimize with max merge docs set to 100,
 and the index was corrupted.
 Thanks
 Depsi

 Dennis Kubes [EMAIL PROTECTED] wrote: According to the Lucene in
 Action book, you can convert from compound to multi-file and vice versa by
 setting the setUseCompoundFile method to true or false. But in running this
 myself I found that while I can convert from multi-file to compound, it
 doesn't convert back. Here is the code that I used.

    try {
      // Placeholder paths - substitute your own
      System.setProperty("org.apache.lucene.lockDir", "lock-directory-path-here");
      String idxDir = "index-directory-path-here";
      // Open the existing index (create = false)
      IndexWriter writer = new IndexWriter(idxDir, new StandardAnalyzer(), false);
      // Ask for multi-file format, then rewrite the index into one segment
      writer.setUseCompoundFile(false);
      writer.optimize();
      writer.close();
    }
    catch (IOException e) {
      e.printStackTrace();
    }

 Dennis

 -Original Message-
 From: depsi programmer [mailto:[EMAIL PROTECTED]
 Sent: Thursday, March 30, 2006 7:57 AM
 To: java-user@lucene.apache.org
 Subject: Compound Indexes Problem

 Hello,
 I am using Lucene for storing details of my students. I have used
 setUseCompoundFile(true) and optimized the indexes. Now I am not able to
 convert them back to their original form.
 Thanks in advance
 Depsi


 -
 New Yahoo! Messenger with Voice. Call regular phones from your PC and save
 big.



 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]




 -
 How low will we go? Check out Yahoo! Messenger's low  PC-to-Phone call
 rates.



Re: Lucene indexing on Hadoop distributed file system

2006-03-26 Thread Raghavendra Prabhu
I would like to see Lucene operate with Hadoop.

As you rightly pointed out, writing to DFS using FSDirectory would be a
performance issue.

I am interested in the idea, but I do not know how much time I can
contribute, given the little time I can spare.

If anyone else is interested, can they join? We can work on this together.

Rgds
Prabhu


On 3/26/06, Igor Bolotin [EMAIL PROTECTED] wrote:

 In my current project we needed a way to create very large Lucene indexes
 on the Hadoop distributed file system. When we tried to do it directly on
 DFS using the Nutch FsDirectory class, we immediately found that indexing
 fails because the DfsIndexOutput.seek() method throws
 UnsupportedOperationException. The reason for this behavior is clear - DFS
 does not support random updates, and so the seek() method can't be
 supported (at least not easily).

 Well, if we can't support random updates - the question is: do we really
 need them? Search in the Lucene code revealed 2 places which call
 IndexOutput.seek() method: one is in TermInfosWriter and another one in
 CompoundFileWriter. As we weren't planning to use CompoundFileWriter - the
 only place that concerned us was in TermInfosWriter.

 TermInfosWriter uses IndexOutput.seek() in its close() method to write the
 total number of terms in the file back into the beginning of the file. It
 was very simple to change the file format a little bit and write the number
 of terms into the last 8 bytes of the file instead of the beginning. The
 only other place that has to be fixed for this to work is the
 SegmentTermEnum constructor - to read this piece of information at position
 = file length - 8.

 With this format hack, we were able to use FsDirectory to write an index
 directly to DFS without any problems. Well, we still don't index directly
 to DFS for performance reasons, but at least we can build small local
 indexes and merge them into the main index on DFS without copying the big
 main index back and forth.

 If somebody is interested - I can post our changes in TermInfosWriter and
 SegmentTermEnum code, although they are pretty trivial.

 Best regards!
 Igor
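Since Igor only sketches the change in prose, here is an illustrative reconstruction of the trailer idea with plain streams (not the actual patch, and the names are hypothetical): the writer appends the term count as the final 8 bytes instead of seeking back to the header, and the reader pulls it from position length - 8:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class TrailerFormat {
    // Append the term count as the final 8 bytes instead of seeking back
    // to patch the file's header - every write is sequential.
    public static byte[] writeWithTrailer(byte[] body, long termCount) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.write(body);          // sequentially written term data
        out.writeLong(termCount); // trailer: last 8 bytes
        out.flush();
        return bos.toByteArray();
    }

    // The reader looks at position (length - 8) for the count.
    public static long readTermCount(byte[] file) {
        return ByteBuffer.wrap(file, file.length - 8, 8).getLong();
    }
}
```

Because every write is an append, the format works on a filesystem with no random-update support, which is exactly the DFS constraint.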




Re: Can i use lucene to search the internet.

2006-03-23 Thread Raghavendra Prabhu
Hi

It can be used if you run Cygwin (the latest version).
Please have a look at the Nutch wiki.

And you are mailing the wrong list.


Rgds
Prabhu

On 3/23/06, Babu, KameshNarayana (GE, Research, consultant) 
[EMAIL PROTECTED] wrote:

  Hi all,
 Can Nutch be used on Windows OS?

 -Original Message-
 *From:* gekkokid [mailto:[EMAIL PROTECTED]
 *Sent:* Thursday, March 23, 2006 11:22 AM
 *To:* java-user@lucene.apache.org
 *Subject:* Re: Can i use lucene to search the internet.

 Hi, are you asking whether it has a crawler? No, it doesn't, but Nutch does:
 http://lucene.apache.org/nutch/ :)

 _gk

 - Original Message -
 *From:* Babu, KameshNarayana (GE, Research, consultant)[EMAIL PROTECTED]
 *To:* java-user@lucene.apache.org
 *Sent:* Thursday, March 23, 2006 5:44 AM
 *Subject:* Can i use lucene to search the internet.



 Hi all,
 Can I use Lucene to search the internet? Or do we have any open source
 applications? Thanks in advance.





Re: Read past EOF error in Windows

2006-03-23 Thread Raghavendra Prabhu
Check whether it has got anything to do with UTF encoding.
There is a newline difference between Windows and Linux.

Rgds
Prabhu


On 3/24/06, Chris Cain [EMAIL PROTECTED] wrote:

 No, that doesn't seem to be the problem.

 Anyone have any other ideas?

 On Tue, 21 Mar 2006, [EMAIL PROTECTED] wrote:

 I had a problem in the past with security on the folder where your index
 is located... but your error does not seem to show that. I would check
 anyway though...

 -Original Message-
 From: Chris Cain cbc20[at]hermes.cam.ac.uk
 To: java-user[at]lucene.apache.org
 Sent: Tue, 21 Mar 2006 15:33:26 + (GMT)
 Subject: Read past EOF error in Windows


 Hi all,

 I wrote a Lucene program which runs fine under Linux and Mac but fails on
 most Windows machines. (I have managed to get it to work on one version of
 XP, however.)

 Specifically, when I open or search the index I get the following error
 message.

 Any help would be appreciated,
 Cheers,
 Chris

 caught a class java.io.IOException
 with message: read past EOF
 java.io.IOException: read past EOF
 at org.apache.lucene.store.FSIndexInput.readInternal(FSDirectory.java:451)
 at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:45)
 at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:219)
 at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:64)
 at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:33)
 at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:46)
 at org.apache.lucene.index.SegmentTermEnum.init(SegmentTermEnum.java:47)
 at org.apache.lucene.index.TermInfosReader.init(TermInfosReader.java:48)
 at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:147)
 at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
 at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:115)
 at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:150)
 at org.apache.lucene.store.Lock$With.run(Lock.java:109)
 at org.apache.lucene.index.IndexReader.open(IndexReader.java:143)
 at org.apache.lucene.index.IndexReader.open(IndexReader.java:127)
 at org.apache.lucene.search.IndexSearcher.init(IndexSearcher.java:42)

 -
 To unsubscribe, e-mail: java-user-unsubscribe[at]lucene.apache.org
 For additional commands, e-mail: java-user-help[at]lucene.apache.org


 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]




Re: lucene highlighter

2006-03-22 Thread Raghavendra Prabhu
Hi Mark

Currently both of the terms have the same score (weightage).

As you mentioned, I would want it to be decreased, so that during the next
run for selecting the second fragment, term1 has less weightage and term2,
which has not been selected, has more weightage.

Thanks
Rgds
Prabhu

On 3/22/06, mark harwood [EMAIL PROTECTED] wrote:

 How can I adjust the Lucene highlighter to make sure
  that at least each term is displayed in the query
 result


 First, some basic things to sanity check:

 * A classic problem: are you using compatible
 analyzers for tokenizing the query and the document
 content (both index time and highlight time)? Term2
 may not be being produced at all.

 * Are you selecting only one fragment and using a
 fragmenter implementation such that Term1 and Term2
 don't happen to fall within the scope of this single
 fragment?

 If both of these checks turn out OK, I suspect what is
 happening is that term2 is weighted significantly less
 than term1 (based on idf and query boosts), and the
 highlighter may be continually selecting multiple
 fragments with term1 in preference to selecting any
 fragments which only contain the lower-scoring term2.

 If this is the case and you really want to ensure that
 term2 gets shown then you can use a custom Scorer
 implementation that influences the highlighter
 according to your preferences. Such an implementation
 could, for example, score fragments that are merely
 repetitions of the same hits (ie your term1) with a
 decreasing value. This would then allow the fragments
 with term2 to be considered more strongly for
 selection.


 Hope this helps
 Mark
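Mark's suggestion - score fragments that merely repeat already-selected hits with a decreasing value - can be sketched as a small stand-alone model. This is not the contrib Highlighter's actual Scorer interface; the class and method names here are hypothetical, and a real implementation would adapt the idea to the highlighter's callbacks:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DecayingFragmentScorer {
    // How many selected fragments have already shown each query term.
    private final Map<String, Integer> seen = new HashMap<>();

    // Each query term contributes weight / (1 + timesAlreadyShown), so
    // fragments repeating term1 lose ground to fragments introducing term2.
    public double score(List<String> fragmentTerms, Map<String, Double> termWeights) {
        double s = 0;
        for (String t : fragmentTerms) {
            Double w = termWeights.get(t);
            if (w == null) continue;
            s += w / (1 + seen.getOrDefault(t, 0));
        }
        return s;
    }

    // Call after a fragment is selected so its terms decay on the next pass.
    public void markSelected(List<String> fragmentTerms, Map<String, Double> termWeights) {
        for (String t : fragmentTerms) {
            if (termWeights.containsKey(t)) {
                seen.merge(t, 1, Integer::sum);
            }
        }
    }
}
```

After the first term1 fragment is selected and marked, a further term1-only fragment can score below a fresh term2 fragment, so term2 finally gets shown.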




 ___
 To help you stay safe and secure online, we've developed the all new
 Yahoo! Security Centre. http://uk.security.yahoo.com

 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]




Adaptive fetch schedule

2006-03-22 Thread Raghavendra Prabhu
Hi

Does the inlink value change solve the OPIC problem which was there -
that is, that on a recrawl the page would have a higher score?

Does this fix that problem?

Rgds
Prabhu


wildcard support

2006-03-21 Thread Raghavendra Prabhu
Hi

I am using the HighlighterTest.java example to extract the wildcard terms.

I use the QueryParser to parse my query string.

Then I store the text in a RAMDirectory (which I want to scan)
and then rewrite the query as mentioned in the highlighter example:

query=query.rewrite(reader)

Now if I print the query, I see that it does not contain anything.

I repeat: the query does not contain anything.

What is the reason for this problem? Any help would be appreciated.

Rgds
Prabhu


lucene highlighter

2006-03-21 Thread Raghavendra Prabhu
Hi guys

If anyone can tell me how to get the best fragments using the highlighter:

The query has two terms - term1 and term2.

The search result displays only term1 in the highlighter, whereas term2 is
also there. How can I adjust the Lucene highlighter to make sure that at
least each term is displayed in the query result?

Rgds
Prabhu


rewriting a query doubt

2006-03-20 Thread Raghavendra Prabhu
Hi

When you rewrite a query using

query=query.rewrite(reader)

does the query change automatically?

For example, if the query was n*w and the reader has new, now, noow,
does the query change to new OR now OR noow?

Can someone tell me how it works?

Rgds
Prabhu
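What rewrite() does for a wildcard query can be modeled in a few lines: the pattern is matched against the terms actually present in the reader, and the query becomes the OR of the matching terms (Lucene's WildcardQuery rewrites to a BooleanQuery of TermQuerys). The sketch below is illustrative only - the regex translation ignores escaping of other special characters:

```java
import java.util.ArrayList;
import java.util.List;

public class WildcardRewrite {
    // Convert a Lucene-style wildcard (* = any run, ? = one char) to a regex
    // and expand it against the terms actually present in the index.
    public static List<String> expand(String pattern, List<String> indexTerms) {
        String regex = pattern.replace("?", ".").replace("*", ".*");
        List<String> matches = new ArrayList<>();
        for (String t : indexTerms) {
            if (t.matches(regex)) matches.add(t);
        }
        return matches;
    }
}
```

So yes: for n*w against a reader containing new, now, noow, the rewritten query is effectively new OR now OR noow. It also follows that if no indexed term matches the pattern, the rewritten query is empty - a likely explanation for the empty query reported in the wildcard support thread above.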


lucene query analysis

2006-03-14 Thread Raghavendra Prabhu
Hi

The problem which I am facing is that the query is case sensitive.

If I type in BIG letters I am not able to see answers, and if I type in
small letters I am able to see results.

Is there anything by which I can do a case conversion?

Right now I am using a WhitespaceAnalyzer. What analyzer should I change it
to?


Rgds
Prabhu


query parser

2006-03-08 Thread Raghavendra Prabhu
I want to use the query parser to parse my query string,

but the default field should be a group of fields - several different
fields that the query is searched on.

Can anyone let me know?

For example, if my query is

new books

"new" should be searched in different fields (content and title), and

"books" should be searched in different fields (content and title).


How do I accomplish this, and how can I extend QueryParser to do the above?


Re: query parser

2006-03-08 Thread Raghavendra Prabhu
Hi Rainer

Thanks. I have one more doubt:

How do I set different boosts for each field using the query parser?

Can I set different boosts for each field?

Rgds
Prabhu

On 3/8/06, Rainer Dollinger [EMAIL PROTECTED] wrote:

 Take a look at the class MultiFieldQueryParser, I think it does exactly
 what you want.

 GR,
 Rainer


 Raghavendra Prabhu wrote:
  I want to use query parser to parse my query string
 
  But the default field should be a group of fields with different fields
  where it is searched on
 
  Can any one let me know
 
  For example if my query is
 
  new books
 
  new should be searched in different fields ( content and title)
 
  books should be searched in different fields ( content and title)
 
 
  How do i accomplish this and how can i extend QueryParser to do the above
 

 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]




MultiField Query Parser

2006-03-08 Thread Raghavendra Prabhu
Hi

I need different boosts for the fields which we define in the MultiField
query parser.

How can this be accomplished?


Rgds
Prabhu
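One way to get per-field boosts without extending the parser is to expand the query text by hand and let QueryParser's ^boost syntax do the work. The field names and boost values below are hypothetical, and this sketch only builds the query string:

```java
public class BoostedMultiField {
    // Expand each word across fields with a per-field boost, producing
    // QueryParser-style syntax: (title:new^3.0 content:new^1.0) ...
    public static String expand(String[] words, String[] fields, float[] boosts) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (sb.length() > 0) sb.append(' ');
            sb.append('(');
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) sb.append(' ');
                sb.append(fields[i]).append(':').append(w).append('^').append(boosts[i]);
            }
            sb.append(')');
        }
        return sb.toString();
    }
}
```

Parsing the resulting string with the ordinary QueryParser yields roughly the query MultiFieldQueryParser would build, but with each field weighted as desired.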


Re: Throughput doesn't increase when using more concurrent threads

2006-02-23 Thread Raghavendra Prabhu
Can nutch be made to use lucene query parser?

Rgds
Prabhu


On 2/23/06, Peter Keegan [EMAIL PROTECTED] wrote:

 Hi Otis,

 The Lucene server is actually CPU and network bound, as the index gets
 memory mapped pretty quickly. There is little disk activity observed.

 I was also able to run the server on a Sun box last night with 4 dual core
 opterons (same Linux and JVM) and I'm observing query rates of 400 qps!
 Has
 Linux been optimized to run on this hardware? I imagine that Sun's JVM has
 been.

 Peter

 On 2/22/06, Otis Gospodnetic [EMAIL PROTECTED] wrote:
 
  Hi,
 
  Some things that could be different:
  - thread scheduling (shouldn't make too much of a difference though)
 
  --- I would also play with disk IO schedulers, if you can.  CentOS is
  based on RedHat, I believe, and RedHat (ext3, really) now has about 4
  different IO schedulers that, according to articles I recently read, can
  have an impact on disk read/write performance.  These schedulers can be
  specified at mount time, I believe, and maybe at boot time (kernel line in
  Grub/LILO).
 
  Otis
 
 
  On 2/22/06, Peter Keegan [EMAIL PROTECTED] wrote:
   I am doing a performance comparison of Lucene on Linux vs Windows.
  
   I have 2 identically configured servers (8-CPUs (real) x 3GHz Xeon
   processors, 64GB RAM). One is running CentOS 4 Linux, the other is
  running
   Windows server 2003 Enterprise Edition x64. Both have 64-bit JVMs from
  Sun.
   The Lucene server is using MMapDirectory. I'm running the jvm with
   -Xmx16000M. Peak memory usage of the jvm on Linux is about 6GB and
 7.8GBon
   windows.
  
   I'm observing query rates of 330 queries/sec on the Wintel server, but
  only
   200 qps on the Linux box. At first, I suspected a network bottleneck,
  but
   when I 'short-circuited' Lucene, the query rates were identical.
  
   I suspect that there are some things to be tuned in Linux, but I'm not
  sure
   what. Any advice would be appreciated.
  
   Peter
  
  
  
   On 1/30/06, Peter Keegan [EMAIL PROTECTED] wrote:
   
I cranked up the dial on my query tester and was able to get the
 rate
  up
to 325 qps. Unfortunately, the machine died shortly thereafter
 (memory
errors :-( ) Hopefully, it was just a coincidence. I haven't
 measured
  64-bit
indexing speed, yet.
   
Peter
   
On 1/29/06, Daniel Noll [EMAIL PROTECTED] wrote:

 Peter Keegan wrote:
  I tried the AMD64-bit JVM from Sun and with MMapDirectory and
 I'm
  now
  getting 250 queries/sec and excellent cpu utilization (equal
 concurrency on
  all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I
  wasn't
 aware
  of it.
 
 Wow.  That's fast.

 Out of interest, does indexing time speed up much on 64-bit
  hardware?
 I'm particularly interested in this side of things because for our
  own
 application, any query response under half a second is good
 enough,
  but
 the indexing side could always be faster. :-)

 Daniel

 --
 Daniel Noll

 Nuix Australia Pty Ltd
 Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
 Phone: (02) 9280 0699
 Fax:   (02) 9212 6902

 This message is intended only for the named recipient. If you are
  not
 the intended recipient you are notified that disclosing, copying,
 distributing or taking any action in reliance on the contents of
  this
 message or attachment is strictly prohibited.





   
  
  
 
 
 
 
 
 
 
 




Re: Throughput doesn't increase when using more concurrent threads

2006-02-23 Thread Raghavendra Prabhu
Hi

Sorry for the trouble.

I was sending my first mail to the group,

replied to this thread, and then later on sent a direct mail.

I would like to apologise for the inconvenience caused.

Rgds
Prabhu


On 2/23/06, Otis Gospodnetic [EMAIL PROTECTED] wrote:

 Hi,

 Please ask on the Nutch mailing list (I answered your question in general@
 already).
 Also, please don't steal other people's threads - it's considered impolite
 for obvious reasons.

 Otis


 - Original Message 
 From: Raghavendra Prabhu [EMAIL PROTECTED]
 To: java-user@lucene.apache.org
 Sent: Thursday, February 23, 2006 11:10:11 AM
 Subject: Re: Throughput doesn't increase when using more concurrent
 threads

 Can nutch be made to use lucene query parser?

 Rgds
 Prabhu

