Re: Tagging using SOLR

2007-09-07 Thread Erik Hatcher


On Sep 7, 2007, at 3:09 AM, Doss wrote:
Thanks for the guidelines, but basically our idea is to build a system like
http://del.icio.us/tag/, is it possible to take counts of similar words from
a solr indexed field?


How do you define similar words?

Here's a tag cloud for a single user in Collex, a system I built:

http://www.nines.org/permalink/cloud/tag/nowviskie

Collex is using SOLR-139 now for tagging/annotating.

You will need to think through handling updates to your documents,  
and whether you will have user-specific tags too.  Reading up on the  
links I sent you and doing some experimenting is highly recommended.   
It's a non-trivial  scenario with Solr at this time.


By the way, a del.icio.us competitor, Simpy, is built on Lucene - so  
it is quite possible to build a heavy duty tagging system, but the  
devil is in the details.
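
For the counting part of the question, here is a minimal sketch in Python. It assumes the comma-separated tag_keywords values have already been fetched as plain strings; the sample data and the name count_tags are made up for illustration. Within Solr itself, faceting on a field that is indexed as one term per tag would be the usual way to get per-tag counts.

```python
from collections import Counter

def count_tags(tag_fields):
    """Count tags across documents.

    tag_fields: iterable of strings such as "solr, lucene, search",
    one per document (hypothetical sample of a tag_keywords field).
    """
    counts = Counter()
    for field in tag_fields:
        # Split on commas, trim whitespace, lowercase, and skip empties.
        tags = {t.strip().lower() for t in field.split(",") if t.strip()}
        # A set is used so a tag repeated within one document counts once.
        counts.update(tags)
    return counts

# Made-up sample documents, not data from the thread.
docs = ["solr, lucene, search", "Solr,tagging", "lucene, solr, solr"]
tag_counts = count_tags(docs)
```

The resulting Counter can feed a tag cloud directly, for example via tag_counts.most_common(20). How "similar words" should be merged (case, stemming, synonyms) is left open here, as in the question above.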


Erik











Re: Replication broken.. no helpful errors?

2007-09-07 Thread Bill Au
As I had pointed out in my first reply to this thread, you had a directory
named temp-snapshot.20070816120113
in your data directory on the slave.  Snapinstaller was mistakenly treating
that as the latest snapshot and was installing that every time it was
called.  Snapinstaller didn't trigger a commit since the same snapshot had
already been installed.

 latest snapshot /opt/solr/data/temp-snapshot.20070816120113 already
 installed

I have opened a bug to improve snapinstaller:

http://issues.apache.org/jira/browse/SOLR-346
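
To illustrate the kind of check that avoids this, here is a rough Python sketch. It is not the snapinstaller script and not the SOLR-346 patch. It assumes that real snapshot directories are named "snapshot." followed by a 14-digit timestamp, as the names in this thread suggest, and the function name latest_snapshot is hypothetical.

```python
import re

# Assumption: real snapshots are named "snapshot." plus a 14-digit timestamp,
# as in the directory names quoted in this thread.
SNAPSHOT_NAME = re.compile(r"^snapshot\.\d{14}$")

def latest_snapshot(names):
    """Return the newest properly named snapshot, or None if there is none."""
    snapshots = [n for n in names if SNAPSHOT_NAME.match(n)]
    # Timestamps are fixed-width, so string order equals chronological order.
    return max(snapshots, default=None)

# Made-up directory listing that includes a leftover temp-snapshot.
names = ["snapshot.20070905101500", "temp-snapshot.20070816120113",
         "snapshot.20070906093000", "index"]
newest = latest_snapshot(names)
```

Anchoring the pattern at both ends is what keeps a leftover temp-snapshot directory from ever being considered.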

Bill


On 9/6/07, Matthew Runo [EMAIL PROTECTED] wrote:

 Well, I've been playing around with it (removed all the snapshots,
 restarted tomcat) and it seems like it works now.. maybe.

 I was noticing that search2 and search3, the slaves, had searchers
 that had been opened several days ago - when we do several 100
 commits and 2 optimizes on search1, the master, every day.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++


 On Sep 6, 2007, at 12:37 PM, Yonik Seeley wrote:

  On 9/6/07, Matthew Runo [EMAIL PROTECTED] wrote:
  The thing is that a new searcher is not opened if I look in the
  stats.jsp page. The index version never changes.
 
  The index version is read from the index... hence if the lucene index
  doesn't change (even if a new snapshot was taken), the version won't
  change even if a new searcher was opened.
 
  Is the problem on the master side now since it looks like the slave is
  pulling a temp-snapshot?
 
  -Yonik
 




Re: Distribution Information?

2007-09-07 Thread Bill Au
In that case, definitely take a look at SOLR-333:

http://issues.apache.org/jira/browse/SOLR-333

On the master there should be a logs/clients directory.  Do you have any
files in there?

Bill

On 9/6/07, Matthew Runo [EMAIL PROTECTED] wrote:

 Well, I do get...

 Distribution Info
 Master Server

 No distribution info present

 ...

 But there appears to be no information filled in.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++


 On Sep 6, 2007, at 6:09 AM, Bill Au wrote:

  That is very strange.  Even if there is something wrong with the
  config or
  code, the static HTML contained in distributiondump.jsp should show
  up.
 
  Are you using the latest version of the JSP?  There has been a
  recent fix:
 
  http://issues.apache.org/jira/browse/SOLR-333
 
  Bill
 
  On 9/5/07, Matthew Runo [EMAIL PROTECTED] wrote:
 
  When I load the distributiondump.jsp, there is no output in my
  catalina.out file.
 
  ++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
  ++
 
 
  On Sep 5, 2007, at 1:55 PM, Matthew Runo wrote:
 
  Not that I've noticed. I'll do a more careful grep soon here - I
  just got back from a long weekend.
 
  ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
  ++
 
 
  On Aug 31, 2007, at 6:12 PM, Bill Au wrote:
 
  Are there any error messages in your appserver log files?
 
  Bill
 
  On 8/31/07, Matthew Runo [EMAIL PROTECTED] wrote:
  Hello!
 
  /solr/admin/distributiondump.jsp
 
  This server is set up as a master server, and other servers use
  the
  replication scripts to pull updates from it every few minutes. My
  distribution information screen is blank.. and I couldn't find any
  information on fixing this in the wiki.
 
  Any chance someone would be able to explain how to get this page
  working, or what I'm doing wrong?
 
  ++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
  ++
 
 
 
 
 
 
 




Re: Indexing very large files.

2007-09-07 Thread Brian Carmalt

Lance Norskog wrote:

Now I'm curious: what is the use case for documents this large?

Thanks,

Lance Norskog


  
It is a rare use case, but could become relevant for us. I was told to
explore the possibilities, and that's what I'm doing. :)

Since I haven't heard any suggestions as to how to do this with a stock
Solr install, other than increase vm memory, I'll assume it will have to
be done with a custom solution.

Thanks for the answers and the interest.

Brian


Re: Indexing very large files.

2007-09-07 Thread Walter Underwood
Legal discovery can have requirements like this. --wunder

On 9/7/07 4:47 AM, Brian Carmalt [EMAIL PROTECTED] wrote:

 Lance Norskog wrote:
 Now I'm curious: what is the use case for documents this large?
 
 Thanks,
 
 Lance Norskog
 
 
   
 It is a rare use case, but could become relevant for us. I was told to
 explore the possibilities, and that's what I'm doing. :)
 
 Since I haven't heard any suggestions as to how to do this with a stock
 Solr install, other than increase vm memory, I'll assume it will have to
 be done
 with a custom solution.
 
 Thanks for the answers and the interest.
 
 Brian



Dilbert (off-topic)

2007-09-07 Thread Jeff Rodenburg
It may be off-topic, but it's Friday and I thought all the Java coders would
appreciate today's Dilbert.  (I'm not primarily a Java dev, but I know the
feeling.)

http://www.dilbert.com/comics/dilbert/archive/dilbert-20070907.html

cheers,
jeff r.


Re: RSS syndication Plugin

2007-09-07 Thread Thorsten Scherler
On Thu, 2007-09-06 at 09:07 -0400, Ryan McKinley wrote:
 perhaps:
 https://issues.apache.org/jira/browse/SOLR-208
 
 in http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/xslt/
 
 check:
 example_atom.xsl
 example_rss.xsl

Awesome.

Thanks very much to Ryan for pointing me in the right direction and to Brian
Whitman for his contribution.

salu2
-- 
Thorsten Scherler thorsten.at.apache.org
Open Source Java  consulting, training and solutions



Re: Tagging using SOLR

2007-09-07 Thread Doss
Dear Thorsten, Erik,

Thanks for the guidelines, but basically our idea is to build a system like
http://del.icio.us/tag/, is it possible to take counts of similar words from
a solr indexed field?

Thanks,
Mohandoss


On 9/6/07, Erik Hatcher [EMAIL PROTECTED] wrote:


 On Sep 6, 2007, at 3:29 AM, Doss wrote:
  We are running an application built using SOLR, and now we are trying to
  build a tagging system using the existing SOLR indexed field called
  tag_keywords. This field has different keywords separated by commas.
  Please give suggestions on how we can build a tagging system using this
  field.

 There is also a wiki page on some brainstorming on how to implement
 tagging within Solr: http://wiki.apache.org/solr/UserTagDesign

 It's easy enough to have a tag_keywords field, but updating a single
 tag_keywords field is not so straightforward without sending the
 entire document to Solr every time it is tagged.  See SOLR-139's
 extensive comments and patches to see what you're getting into.

Erik




Lucene/Solr OnTheRoad

2007-09-07 Thread Erik Hatcher
I just added brief mentions of some upcoming Lucene/Solr-related  
events to this page:


  http://wiki.apache.org/lucene-java/OnTheRoad

Below is some self-promotion of an upcoming class I have agreed to
teach.  It's uncomfortable to send this sort of thing out, but if I
don't then you might never stumble upon it since it's being marketed
in a low-key way.


I'll be teaching a week long Lucene/Solr course October 28-November  
2.  More details here:


http://opensourceretreat.com/courses/category/solr/


marketing
Want to get on the cutting edge of Solr and Lucene? Join me and a few  
crack developers between October 28 and November 2 at the Four  
Diamond Boar's Head Inn in beautiful Charlottesville, Virginia!  
Imagine five all-inclusive days of training, mentoring, and maybe  
even some recreation at the Open Source Retreat. Find out more at  
http://www.opensourceretreat.com.


What are you going to get?
In depth training including
(Your bullet points)

Plus our signature features:

* Class size limited to 25 students
* One-on-one instructor time with me!
* Meals and lodging included in the course price

There is a $250 discount off the price of the course for signing up  
21 days in advance of the course start!


Sign up today at http://opensourceretreat.com/courses/category/solr/!  
Space is limited!


About the Boar's Head Inn

Spa
The spa offers more than 30 distinct services including therapeutic  
massage, LaStone, and more.

Birdwood Golf Course
Enjoy a round of golf at our Birdwood Golf Course, which has been  
recognized by Washington Golf Monthly as one of the Mid-Atlantic's  
top 100 courses.

Dining
Dining at the Inn is a Four-Diamond experience. The Old Mill Room has  
received this prestigious award for nineteen consecutive years.

Vineyards
Tour area wineries and taste some of the finest vintages produced in  
the country. Many premier Virginia vineyards are located within a  
short drive from Boar's Head Inn.

/marketing


Re: Distribution Information?

2007-09-07 Thread Matthew Runo

Actually I don't have the clients directory...

[EMAIL PROTECTED]: .../logs]$ pwd
/opt/solr/logs
[EMAIL PROTECTED]: .../logs]$ ls
rsyncd-enabled  rsyncd.log  rsyncd.pid  snapcleaner.log   
snapshooter.log  snapshot.current.search2  snapshot.status.search2

[EMAIL PROTECTED]: .../logs]$


It does look like it could be a path issue. I wonder why, though, no  
clients sub directory was created.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 7, 2007, at 7:43 AM, Bill Au wrote:


In that case, definitely take a look at SOLR-333:

http://issues.apache.org/jira/browse/SOLR-333

On the master there should be a logs/clients directory.  Do you have any
files in there?

Bill



Return 2 fields per facet.. name and id, for example?

2007-09-07 Thread Matthew Runo

Hello!

I've found something which is either already in SOLR, or should be  
(as I can see it being very helpful). I couldn't figure out how to do  
it though..


Let's say I'm trying to print out a page of products, and I want to
provide a list of brands to filter by. It would be great if in my  
facets I could get this sort of xml...



<int name="adidas" id="145"></int>

That way, I'd be able to know the brand id of adidas without having  
to run a second query somewhere for each facet to look it up. Without  
having this, I'd have to pass fq=brand:adidas to make this work, and
with some brands having strange characters in their names (éS, for
example), passing the name in the URL can be annoying =p


Any way to do this?

++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++




Re: Return 2 fields per facet.. name and id, for example?

2007-09-07 Thread Matthew Runo

Ahh... sneaky. I'll probably do the combined-name#id method.

++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 7, 2007, at 12:38 PM, Yonik Seeley wrote:


On 9/7/07, Matthew Runo [EMAIL PROTECTED] wrote:

I've found something which is either already in SOLR, or should be
(as I can see it being very helpful). I couldn't figure out how to do
it though..

Let's say I'm trying to print out a page of products, and I want to
provide a list of brands to filter by. It would be great if in my
facets I could get this sort of xml...


<int name="adidas" id="145"></int>

That way, I'd be able to know the brand id of adidas without having
to run a second query somewhere for each facet to look it up.


If you can get the name from the id in your webapp, then index the id
to begin with (instead of the name).
<int name="145"></int>

 Or, if you need both the name and the id, index them both together,
separated by a special character that you can strip out on the webapp
side...

<int name="adidas#145"></int>
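
On the webapp side, taking the combined value apart could look like the following Python sketch. The helper names split_facet_value and join_facet_value are made up for illustration, and "#" is just the separator used in the example above; any character that cannot occur in an id would do.

```python
def split_facet_value(value, sep="#"):
    """Split a combined facet value like "adidas#145" into (name, id)."""
    # rpartition splits on the LAST separator, so a name that itself
    # contains "#" still yields the trailing id.
    name, found, ident = value.rpartition(sep)
    if not found:
        raise ValueError(f"no {sep!r} separator in {value!r}")
    return name, ident

def join_facet_value(name, ident, sep="#"):
    """Build the combined value to index, e.g. "adidas#145"."""
    return f"{name}{sep}{ident}"
```

The filter query can then use the id part alone, which sidesteps awkward characters in brand names.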

-Yonik





Re: Solr and KStem

2007-09-07 Thread Walter Underwood
Even if KStem isn't ASL, we could include the plug-in code
with notes about how to get the stemmer. Or, the Solr plug-in
could be contributed to the group that manages the KStem
distribution:

  http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi

wunder

On 9/7/07 12:59 PM, Yonik Seeley [EMAIL PROTECTED] wrote:

 On 9/7/07, Wagner,Harry [EMAIL PROTECTED] wrote:
 I've implemented a Solr plug-in that wraps KStem for Solr use.  KStem is
 considered to be more appropriate for library usage since it is much
 less aggressive than Porter (i.e., searches for organization do NOT
 match on organ!). If there is any interest in feeding this back into
 Solr I would be happy to contribute it.
 
 Absolutely.
 We need to make sure that the license for that k-stemmer is ASL
 compatible of course.
 
 -Yonik



Re: Indexing very large files.

2007-09-07 Thread Mike Klaas

On 7-Sep-07, at 4:47 AM, Brian Carmalt wrote:


Lance Norskog wrote:

Now I'm curious: what is the use case for documents this large?




It is a rare use case, but could become relevant for us. I was told
to explore the possibilities, and that's what I'm doing. :)

Since I haven't heard any suggestions as to how to do this with a
stock Solr install, other than increase vm memory, I'll assume it
will have to be done with a custom solution.


Well, have you tried the CSV importer?

-Mike


FW: Space costs of dynamic fields?

2007-09-07 Thread Lance Norskog
Are there any extra costs for dynamic vs. static fields? That is, if I have
the same dynamic field in 95% of my documents, should I just make it static
and empty in the other 5%? Will query speed change? Which choice will use
more space?
 
Otherwise, the only downside of dynamic fields is that you can't say, give
me fields a*_t but not b*_t in a query. I haven't found others in the mail
archives or the wiki.
 
Thanks,
 
Lance Norskog


org.apache.lucene.util.English missing

2007-09-07 Thread Lance Norskog
Hi folks-
 
The Lucene Spellchecker unit test expects a Java class
org.apache.lucene.util.English. I can't find it in the source trees on
svn.apache.org. Can someone please mail it to me?
 
Thanks,
 
Lance Norskog


FW: Minor mistake on the Wiki

2007-09-07 Thread Lance Norskog
In the page http://wiki.apache.org/solr/UpdateXmlMessages

We find:

Optional attributes on doc

*   boost = float - default is 1.0 (See Lucene docs for
definition of boost.) 
*   NOTE: make sure norms are enabled (omitNorms=false
in the schema.xml) for any fields where the index-time boost should be
stored. 

This NOTE appears to be block-copied from the following entry about
field-level boosts, and makes no sense here.

Lance Norskog




adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-07 Thread Lance Norskog
Hi-
 
It appears that DirectUpdateHandler2.java does not actually implement the
parameters that control whether to override existing documents. Should I use
DirectUpdateHandler instead? Apparently DUH is slower than DUH2, but DUH
implements these parameters.  (We do so many overwrites that switching to
DUH is probably a win.)

From DirectUpdateHandler2.java:addDoc()
 
if (!cmd.allowDups && !cmd.overwritePending && !cmd.overwriteCommitted) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "unsupported param combo:" + cmd);
  // this would need a reader to implement (to be able to check committed
  // before adding.)
  // return addNoOverwriteNoDups(cmd);
} else if (!cmd.allowDups && !cmd.overwritePending && cmd.overwriteCommitted) {
  rc = addConditionally(cmd);
} else if (!cmd.allowDups && cmd.overwritePending && !cmd.overwriteCommitted) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "unsupported param combo:" + cmd);
} else if (!cmd.allowDups && cmd.overwritePending && cmd.overwriteCommitted) {
  rc = overwriteBoth(cmd);
} else if (cmd.allowDups && !cmd.overwritePending && !cmd.overwriteCommitted) {
  rc = allowDups(cmd);
} else if (cmd.allowDups && !cmd.overwritePending && cmd.overwriteCommitted) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "unsupported param combo:" + cmd);
} else if (cmd.allowDups && cmd.overwritePending && !cmd.overwriteCommitted) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "unsupported param combo:" + cmd);
} else if (cmd.allowDups && cmd.overwritePending && cmd.overwriteCommitted) {
  rc = overwriteBoth(cmd);
}
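
Read as a table, the chain above comes down to the following. This is a Python transcription for readability only, not Solr code; the names COMBOS and dispatch are made up, and the action strings are simply the method names from the snippet.

```python
# Outcome for each (allowDups, overwritePending, overwriteCommitted) triple,
# transcribed from the if/else chain in the message; None marks the
# combinations that raise "unsupported param combo".
COMBOS = {
    (False, False, False): None,
    (False, False, True):  "addConditionally",
    (False, True,  False): None,
    (False, True,  True):  "overwriteBoth",
    (True,  False, False): "allowDups",
    (True,  False, True):  None,
    (True,  True,  False): None,
    (True,  True,  True):  "overwriteBoth",
}

def dispatch(allow_dups, overwrite_pending, overwrite_committed):
    """Return the action name for a flag triple, or raise if unsupported."""
    action = COMBOS[(allow_dups, overwrite_pending, overwrite_committed)]
    if action is None:
        raise ValueError("unsupported param combo")
    return action

# The flag triples that do not raise.
supported = sorted(k for k, v in COMBOS.items() if v is not None)
```

Four of the eight combinations are supported; the other four raise.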



Re: FW: Minor mistake on the Wiki

2007-09-07 Thread Yonik Seeley
On 9/7/07, Lance Norskog [EMAIL PROTECTED] wrote:
 In the page http://wiki.apache.org/solr/UpdateXmlMessages

 We find:

 Optional attributes on doc

 *   boost = float - default is 1.0 (See Lucene docs for
 definition of boost.)
 *   NOTE: make sure norms are enabled (omitNorms=false
 in the schema.xml) for any fields where the index-time boost should be
 stored.

 This NOTE appears to be block-copied from the following entry about
 field-level boosts, and makes no sense here.

Perhaps it could be worded better, but there is some sense behind it.
There is no document boost in a Lucene index... a doc boost is simply
multiplied into the boost for each field as the document is indexed.
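
As plain arithmetic, that amounts to the following small Python illustration. The function name and the field names are made up, and the sketch leaves out how the resulting value is then stored in each field's norm.

```python
def effective_field_boosts(doc_boost, field_boosts):
    """Multiply a document-level boost into each field's own boost."""
    return {name: doc_boost * boost for name, boost in field_boosts.items()}

# Hypothetical document with boost 2.0 and two fields.
boosts = effective_field_boosts(2.0, {"title": 1.5, "body": 1.0})
```

This is why the note about norms applies to the doc-level boost too: with norms omitted on a field, there is nowhere for that field's share of the boost to be kept.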

-Yonik


Re: org.apache.lucene.util.English missing

2007-09-07 Thread Otis Gospodnetic
Really?  Weird.
It's here:
/home/otis/dev/repos/lucene/java/trunk
[EMAIL PROTECTED] trunk]$ ff English.java
./src/test/org/apache/lucene/util/English.java

Note that this is Lucene and that it's src/test.

Otis
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

- Original Message 
From: Lance Norskog [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Friday, September 7, 2007 4:33:03 PM
Subject: org.apache.lucene.util.English missing

Hi folks-
 
The Lucene Spellchecker unit test expects a Java class
org.apache.lucene.util.English. I can't find it in the source trees on
svn.apache.org. Can someone please mail it to me?
 
Thanks,
 
Lance Norskog





Re: adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-07 Thread Yonik Seeley
On 9/7/07, Lance Norskog [EMAIL PROTECTED] wrote:
 It appears that DirectUpdateHandler2.java does not actually implement the
 parameters that control whether to override existing documents.

It's been proposed that most of these be deprecated anyway and
replaced with a simple overwrite=true/false.  Are you trying to do
something different than standard overwriting?

-Yonik


Re: Solr and KStem

2007-09-07 Thread Otis Gospodnetic
Look for KStem in Lucene JIRA.  Many years ago something KStem related was 
contributed, and there was a discussion about licenses then.

Otis
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

- Original Message 
From: Walter Underwood [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Friday, September 7, 2007 4:31:25 PM
Subject: Re: Solr and KStem

Even if KStem isn't ASL, we could include the plug-in code
with notes about how to get the stemmer. Or, the Solr plug-in
could be contributed to the group that manages the KStem
distribution:

  http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi

wunder







RE: adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-07 Thread Lance Norskog
No, I'm just doing standard overwriting. It just took a little digging to be
able to do it :)
To gild the lily, it would be efficient in our case to add a boolean flag to
each record saying whether to overwrite this record.
This would make each record read-only or read-write. But I think this is an
unusual use case, so we will do it our own way.

Thanks for your time,

Lance 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Friday, September 07, 2007 1:40 PM
To: solr-user@lucene.apache.org
Subject: Re: adding without overriding dups - DirectUpdateHandler2.java does
not implement?

On 9/7/07, Lance Norskog [EMAIL PROTECTED] wrote:
 It appears that DirectUpdateHandler2.java does not actually implement 
 the parameters that control whether to override existing documents.

It's been proposed that most of these be deprecated anyway and replaced with
a simple overwrite=true/false.  Are you trying to do something different
than standard overwriting?

-Yonik



Re: adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-07 Thread Yonik Seeley
On 9/7/07, Lance Norskog [EMAIL PROTECTED] wrote:
 No, I'm just doing standard overwriting. It just took a little digging to be
 able to do it :)

Overwriting is the default... you shouldn't have to specify
anything extra when indexing the document.

-Yonik


Re: Distribution Information?

2007-09-07 Thread Bill Au
I just double checked distribution.jsp.  The directory where it looks for
status files is hard coded to logs/clients.  So for now master_status_dir in
your solr/conf/scripts.conf has to be set to that so the scripts will put
the status files there.  It looks like they are currently in your logs
directory.  The status files are snapshot.current.search2 and
snapshot.status.search2.

Bill

On 9/7/07, Matthew Runo [EMAIL PROTECTED] wrote:

 Actually I don't have the clients directory...

 [EMAIL PROTECTED]: .../logs]$ pwd
 /opt/solr/logs
 [EMAIL PROTECTED]: .../logs]$ ls
 rsyncd-enabled  rsyncd.log  rsyncd.pid  snapcleaner.log
 snapshooter.log  snapshot.current.search2  snapshot.status.search2
 [EMAIL PROTECTED]: .../logs]$


 It does look like it could be a path issue. I wonder why, though, no
 clients sub directory was created.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++






Re: adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-07 Thread Mike Klaas

On 7-Sep-07, at 1:35 PM, Lance Norskog wrote:


Hi-

It appears that DirectUpdateHandler2.java does not actually  
implement the
parameters that control whether to override existing documents.  
Should I use


No?  allowDups=true overwritePending=false overwriteCommitted=false
should result in adding docs with no overwriting with DUH2.


As yonik said, overwriting is the default behaviour.  It is based on  
uniqueKey, which must be defined for overwriting to work.
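
A toy model of that behaviour, in Python. This is only an illustration of overwrite-by-key semantics, not how Solr implements it; the list-based index, the add_doc helper and the key field name "id" are all assumptions.

```python
def add_doc(index, doc, unique_key="id", overwrite=True):
    """Add doc to a toy in-memory index (a list of dicts)."""
    if overwrite:
        # Drop any earlier document with the same unique key value.
        index[:] = [d for d in index if d[unique_key] != doc[unique_key]]
    index.append(doc)
    return index

index = []
add_doc(index, {"id": "1", "title": "first"})
add_doc(index, {"id": "1", "title": "second"})                   # replaces
add_doc(index, {"id": "1", "title": "third"}, overwrite=False)   # duplicate kept
```

Without a key field there is nothing to match on, which is why a uniqueKey has to be defined for overwriting to work.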


DirectUpdateHandler instead? Apparently DUH is slower than DUH2,  
but DUH
implements these parameters.  (We do so many overwrites that  
switching to

DUH is probably a win.)


DUH also does not implement many newer update features, like autoCommit.

-Mike


New user question: How to show all stored fields in a result

2007-09-07 Thread melkink

Hello Solr Folks,

I'm a new Solr user and I'm running into a frustrating problem.  I'm sure
there's a simple solution; I just don't have the experience with Solr to know
the correct way around it.

I currently have approximately 600 documents stored and indexed in Solr.
Each document has some level of associated metadata. I can query the Solr
index with no problems, but I can't seem to get a full set of metadata with
my search results.  For example, if I search on the text of the stored
document, all that is sent back is the score and the id of the hit.

Query:  

http://localhost:8983/solr/select?indent=on&version=2.2&q=doctext%3Ashakespeare&start=0&rows=10&fl=*&qt=standard&wt=standard&explainOther=&hl.fl=
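
A query URL of this shape can be assembled programmatically, which takes care of the escaping and the "&" separators. The following Python sketch uses only the standard library; the helper name select_url is made up, and the parameter values are copied from the query in this message.

```python
from urllib.parse import urlencode

def select_url(base, query, fields="*", start=0, rows=10):
    """Build a /select URL; urlencode joins parameters with "&"."""
    params = [
        ("indent", "on"),
        ("version", "2.2"),
        ("q", query),
        ("start", start),
        ("rows", rows),
        ("fl", fields),
        ("qt", "standard"),
        ("wt", "standard"),
    ]
    return base + "?" + urlencode(params)

url = select_url("http://localhost:8983/solr/select", "doctext:shakespeare")
```

Passing fields="*,score" gives the second form of the query used further down.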

Partial Result:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="explainOther"/>
<str name="fl">*</str>
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">doctext:shakespeare</str>
<str name="hl.fl"/>
<str name="qt">standard</str>
<str name="wt">standard</str>
<str name="version">2.2</str>
<str name="rows">10</str>
</lst>
</lst>
<result name="response" numFound="470" start="0">
<doc>
<str name="id">2eb453aab5101de037ba0747139ebd27</str>
</doc>
...
</result>
</response>

However, If I search the authors_primary field I get the full metadata
listing:

Query:

http://localhost:8983/solr/select?indent=on&version=2.2&q=authors_primary%3A%22Russ%2CJon+R.%22&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=

Result:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 <lst name="params">
  <str name="explainOther"/>
  <str name="fl">*,score</str>
  <str name="indent">on</str>
  <str name="start">0</str>
  <str name="q">authors_primary:"Russ,Jon R."</str>
  <str name="hl.fl"/>
  <str name="qt">standard</str>
  <str name="wt">standard</str>
  <str name="version">2.2</str>
  <str name="rows">10</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="6.4764633">
 <doc>
  <float name="score">6.4764633</float>
  <str name="abstract">[Asserts that the brass of line 3 of Sonnet 1 is a
metonym for cannon.]</str>
  <str name="accession_number"> </str>
  <str name="author_address"> </str>
  <str name="authors_primary">Russ,Jon R.</str>
  <str name="authors_quaternary">Anonymous </str>
  <str name="authors_quinary">Anonymous </str>
  <str name="authors_secondary">Anonymous</str>
  <str name="authors_tertiary">Anonymous</str>
  <str name="availability"> </str>
  <str name="call_number"> </str>
  <str name="classification"> </str>
  <str name="data_source"> </str>
  <str name="database"> </str>
  <str name="doi"> </str>
  <str name="edition"> </str>
  <str name="id">2eb453aab5101de037ba0747139ebd27</str>
  <str name="identifying_phrase"> </str>
  <str name="issn_isbn">0014-4940</str>
  <str name="issue"> </str>
  <str name="keywords">English literature;1500-1599;Shakespeare,
William;Sonnets</str>
  <str name="language"> </str>
  <str
name="links">http://search.epnet.com/login.aspx?direct=true&amp;db=mzh&amp;an=1972103657</str>
  <str name="notes">Accession Number: 1972103657. Peer Reviewed: Yes.
Publication Type: journal article. Language: English. Update Code: 197201.
Sequence No: 1972-1-3657. INDIVIDUAL WORKS - PLAYS; The Sonnets; Scholarship
and Criticism; Textual and Bibliographical Studies; metonymy</str>
  <str name="original_foreign_title"> </str>
  <str name="other_pages"> </str>
  <str name="periodical_abbrev"> </str>
  <str name="periodical_full">Explicator</str>
  <str name="place_of_publication">Washington, DC</str>
  <str name="pub_date_free_from"> </str>
  <str name="pub_year">1972</str>
  <str name="publisher"> </str>
  <str name="reference_type">Journal</str>
  <str name="retrieved_date"> </str>
  <str name="shortened_title"> </str>
  <str name="start_page">Item 38</str>
  <str name="subfile_database"> </str>
  <str name="title_primary">Shakespeare's Sonnet LXIV</str>
  <str name="title_secondary"> </str>
  <str name="title_tertiary"> </str>
  <str name="user_1"> </str>
  <str name="user_2"> </str>
  <str name="user_3"> </str>
  <str name="user_4"> </str>
  <str name="user_5"> </str>
  <str name="volume">30</str>
 </doc>
</result>
</response>

How do I get the full metadata listing when I do a query on the doctext
field?  I've attached my schema.xml file in case that can help someone
answer my question.

Thank you in advance!

Mike

http://www.nabble.com/file/p12531829/schema.xml schema.xml 




Re: Distribution Information?

2007-09-07 Thread Matthew Runo

OK. I made the change, but it seemed not to pick up the files.

When I changed distributiondump.jsp to say...

File masterdir = new File("/opt/solr/logs/clients");

it worked. Thank you for your help!

++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 7, 2007, at 2:21 PM, Bill Au wrote:

I just double checked distribution.jsp.  The directory where it looks for
status files is hard coded to logs/clients.  So for now master_status_dir in
your solr/conf/scripts.conf has to be set to that so the scripts will put
the status files there.  It looks like they are currently in your logs
directory.  The status files are snapshot.current.search2 and
snapshot.status.search2.

Bill
