from:"J G"

Re: Filter multivalue fields from search result

2010-07-12 Thread Alex J. G. Burzyński


Hi,

So if those are separate documents, how should I handle paging? With two
separate queries?
The first to return all matching course-event pairs, and the second to get
the courses for a given page?

Is this common design described in detail somewhere?

Thanks,
Alex

On 2010-07-09 01:50, Lance Norskog wrote:

Yes, denormalizing the index into separate (name,town) pairs is the
common design for this problem.
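
As a rough illustration of the denormalization suggested above, the sketch below flattens each course into one document per event. It is only a sketch under assumptions: the documents are plain in-memory dicts rather than real Solr documents, the `town` and `date` lists are assumed to be parallel (same order, same length), and the `course_id` field and the composite `id` scheme are hypothetical names chosen for the example.

```python
# Hypothetical in-memory course documents mirroring the example table.
courses = [
    {"id": "1", "name": "Microsoft Excel",
     "town": ["London", "Glasgow", "Leeds"],
     "date": ["2010-08-20", "2010-08-24", "2010-08-28"]},
    {"id": "2", "name": "Microsoft Word",
     "town": ["Aberdeen", "Reading", "London"],
     "date": ["2010-08-21", "2010-08-25", "2010-08-29"]},
    {"id": "3", "name": "Microsoft Powerpoint",
     "town": ["Birmingham", "Leeds"],
     "date": ["2010-08-22", "2010-08-26"]},
]


def denormalize(course):
    """Yield one flat document per (town, date) event of a course."""
    events = zip(course["town"], course["date"])
    for n, (town, date) in enumerate(events, start=1):
        yield {
            "id": f"{course['id']}-{n}",   # hypothetical composite id
            "course_id": course["id"],     # hypothetical field name
            "name": course["name"],
            "town": town,
            "date": date,
        }


# One document per course-event pair; each now holds a single town and date.
pairs = [doc for course in courses for doc in denormalize(course)]

# A filter on town now returns only the matching events.
leeds = [doc["id"] for doc in pairs if doc["town"] == "Leeds"]
```

With documents shaped like this, a search on town no longer drags along the other towns of the same course, because each document carries exactly one event.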

2010/7/8 Alex J. G. Burzyński <mailing-s...@ajgb.net>:
   

Hi,

Is it possible to remove from search results the multivalued fields that
don't pass the search criteria?

My schema is defined as:

<!-- course_id -->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<!-- course_name -->
<field name="name" type="string" indexed="true" stored="true"/>
<!-- events.event_town -->
<field name="town" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- events.event_date -->
<field name="date" type="tdate" indexed="true" stored="true" multiValued="true"/>

And example docs are:

+----+----------------------+------------+------------+
| id | name                 | town       | date       |
+----+----------------------+------------+------------+
| 1  | Microsoft Excel      | London     | 2010-08-20 |
|    |                      | Glasgow    | 2010-08-24 |
|    |                      | Leeds      | 2010-08-28 |
| 2  | Microsoft Word       | Aberdeen   | 2010-08-21 |
|    |                      | Reading    | 2010-08-25 |
|    |                      | London     | 2010-08-29 |
| 3  | Microsoft Powerpoint | Birmingham | 2010-08-22 |
|    |                      | Leeds      | 2010-08-26 |
+----+----------------------+------------+------------+

So the query for q=name:Microsoft town:Leeds returns docs 1 and 3.

How would I remove London/Glasgow from doc 1 and Birmingham from doc 3?

Or should I create a separate doc for each name-event pair?

Thanks,
Alex

Re: Filter multivalue fields from search result

2010-07-12 Thread Alex J. G. Burzyński


Hi Chantal,

The paging problem I asked about is that, with course-event pairs,
specifying rows limits the number of pairs returned, not the number of courses:


+-------+----------------------+------------+------------+
| id-id | name                 | town       | date       |
+-------+----------------------+------------+------------+
| 1-1   | Microsoft Excel      | London     | 2010-08-20 |
| 1-2   | Microsoft Excel      | Glasgow    | 2010-08-24 |
| 1-3   | Microsoft Excel      | Leeds      | 2010-08-28 |
| 2-1   | Microsoft Word       | Aberdeen   | 2010-08-21 |
| 2-2   | Microsoft Word       | Reading    | 2010-08-25 |
| 2-3   | Microsoft Word       | London     | 2010-08-29 |
| 3-1   | Microsoft Powerpoint | Birmingham | 2010-08-22 |
| 3-2   | Microsoft Powerpoint | Leeds      | 2010-08-26 |
| 3-3   | Microsoft Powerpoint | Leeds      | 2010-08-30 |
+-------+----------------------+------------+------------+


And from the UI point of view I'm returning fewer courses than events -
that's why I asked about paging.


The search for q=name:Microsoft town:Leeds with rows=2 should return:
1-3, 3-2 and 3-3

But 3-3 will obviously be on page 2.

I hope that makes my question clearer.

Thanks,
Alex
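
The mismatch described above can be illustrated with a small client-side sketch: take all matching course-event pairs, group them by course, and page over courses instead of pairs. This is only one possible reading of the "two queries" idea, not a Solr feature. It assumes every matching pair has already been fetched into memory, and that the course id is the part of the pair id before the dash. The function name `page_of_courses` is hypothetical.

```python
# Matching pairs for town:Leeds, as (pair id, course name, town, date).
matches = [
    ("1-3", "Microsoft Excel", "Leeds", "2010-08-28"),
    ("3-2", "Microsoft Powerpoint", "Leeds", "2010-08-26"),
    ("3-3", "Microsoft Powerpoint", "Leeds", "2010-08-30"),
]


def page_of_courses(pairs, page, rows):
    """Group pairs by course id and return one page of courses.

    `page` is zero-based and `rows` counts courses, not pairs.
    """
    grouped = {}  # dicts keep insertion order, so result order is preserved
    for pair_id, _name, _town, _date in pairs:
        course_id = pair_id.split("-")[0]
        grouped.setdefault(course_id, []).append(pair_id)
    course_ids = list(grouped)[page * rows:(page + 1) * rows]
    return [(course_id, grouped[course_id]) for course_id in course_ids]


# With rows=2 counted in courses, both Leeds events of course 3 stay together.
first_page = page_of_courses(matches, page=0, rows=2)
```

The obvious cost is that the client has to see every matching pair before it can cut a page, which may not scale to large result sets.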


On 2010-07-12 10:26, Chantal Ackermann wrote:

Hi Alex,

I think you have to explain the complete use case. Paging is done by
specifying the parameter start (and rows if you want to have more or
less than 10 hits per page). For each page you need of course a new
query, but the queries differ only in the parameter value start (first
page start=0, second page start=10 etc. if rows=10). The other
parameters remain the same.
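
A minimal sketch of the paging scheme just described: the query stays the same and only start changes from page to page. The base URL is an assumption (a default local Solr install) and the helper name `page_params` is hypothetical; q, start and rows are the parameters named above.

```python
from urllib.parse import urlencode

# Assumed location of the select handler on a local Solr install.
BASE_URL = "http://localhost:8983/solr/select"


def page_params(query, page, rows=10):
    """Return request parameters for a zero-based page number."""
    return {"q": query, "start": page * rows, "rows": rows}


# Print the request URLs for the first two pages of the same query.
for page in range(2):
    params = page_params("name:Microsoft town:Leeds", page)
    print(BASE_URL + "?" + urlencode(params))
```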

You should also have a look at facets. They might help you to get a list
of the values of your multi valued fields that you can display in the
UI, allowing the user to drill down the results further.

Chantal

On Mon, 2010-07-12 at 10:26 +0200, Alex J. G. Burzyński wrote:
   

Hi,

So if those are separate documents, how should I handle paging? With two
separate queries?
The first to return all matching course-event pairs, and the second to get
the courses for a given page?

Is this common design described in detail somewhere?

Thanks,
Alex

On 2010-07-09 01:50, Lance Norskog wrote:
 

Yes, denormalizing the index into separate (name,town) pairs is the
common design for this problem.

2010/7/8 Alex J. G. Burzyński <mailing-s...@ajgb.net>:

   

Hi,

Is it possible to remove from search results the multivalued fields that
don't pass the search criteria?

My schema is defined as:

<!-- course_id -->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<!-- course_name -->
<field name="name" type="string" indexed="true" stored="true"/>
<!-- events.event_town -->
<field name="town" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- events.event_date -->
<field name="date" type="tdate" indexed="true" stored="true" multiValued="true"/>

And example docs are:

+----+----------------------+------------+------------+
| id | name                 | town       | date       |
+----+----------------------+------------+------------+
| 1  | Microsoft Excel      | London     | 2010-08-20 |
|    |                      | Glasgow    | 2010-08-24 |
|    |                      | Leeds      | 2010-08-28 |
| 2  | Microsoft Word       | Aberdeen   | 2010-08-21 |
|    |                      | Reading    | 2010-08-25 |
|    |                      | London     | 2010-08-29 |
| 3  | Microsoft Powerpoint | Birmingham | 2010-08-22 |
|    |                      | Leeds      | 2010-08-26 |
+----+----------------------+------------+------------+

So the query for q=name:Microsoft town:Leeds returns docs 1 and 3.

How would I remove London/Glasgow from doc 1 and Birmingham from doc 3?

Or should I create a separate doc for each name-event pair?

Thanks,
Alex

Filter multivalue fields from search result

2010-07-08 Thread Alex J. G. Burzyński

Hi,

Is it possible to remove from search results the multivalued fields that
don't pass the search criteria?

My schema is defined as:

<!-- course_id -->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<!-- course_name -->
<field name="name" type="string" indexed="true" stored="true"/>
<!-- events.event_town -->
<field name="town" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- events.event_date -->
<field name="date" type="tdate" indexed="true" stored="true" multiValued="true"/>

And example docs are:

+----+----------------------+------------+------------+
| id | name                 | town       | date       |
+----+----------------------+------------+------------+
| 1  | Microsoft Excel      | London     | 2010-08-20 |
|    |                      | Glasgow    | 2010-08-24 |
|    |                      | Leeds      | 2010-08-28 |
| 2  | Microsoft Word       | Aberdeen   | 2010-08-21 |
|    |                      | Reading    | 2010-08-25 |
|    |                      | London     | 2010-08-29 |
| 3  | Microsoft Powerpoint | Birmingham | 2010-08-22 |
|    |                      | Leeds      | 2010-08-26 |
+----+----------------------+------------+------------+

So the query for q=name:Microsoft town:Leeds returns docs 1 and 3.

How would I remove London/Glasgow from doc 1 and Birmingham from doc 3?

Or should I create a separate doc for each name-event pair?

Thanks,
Alex

Solr Spellcheck on Large index size

2010-04-27 Thread Kyle J G


I am trying to create a spell checker for my company's website.

Currently there are approximately 29 million documents in the index.

When trying to create the spelling index it just seems to skip over the
command.

My fields in schema.xml look like the following:

<field name="ID" type="int" indexed="true" stored="true" required="true" />
<field name="LineCode" type="string" indexed="true" stored="true" required="true" />
<field name="PartNumber" type="string" indexed="true" stored="true" required="true" />
<field name="CategoryName" type="string" indexed="true" stored="true" required="true" />
<field name="PartTerminologyName" type="string" indexed="true" stored="true" required="true" />
<field name="Year" type="int" indexed="true" stored="true" required="true" />
<field name="Make" type="string" indexed="true" stored="true" required="true" />
<field name="Model" type="string" indexed="true" stored="true" required="true" />
<field name="Submodel" type="string" indexed="true" stored="true" />
<field name="EngType" type="string" indexed="true" stored="true" required="true" />
<field name="Liter" type="string" indexed="true" stored="true" required="true" />
<field name="CC" type="int" indexed="true" stored="true" required="true" />
<field name="CID" type="int" indexed="true" stored="true" required="true" />
<field name="Fuel" type="string" indexed="true" stored="true" required="true" />
<field name="FuelDel" type="string" indexed="true" stored="true" required="true" />
<field name="Asp" type="string" indexed="true" stored="true" required="true" />
<field name="EngVin" type="string" indexed="true" stored="true" required="true" />
<field name="EngDesg" type="string" indexed="true" stored="true" required="true" />

And copying fields as such: 
<copyField source="Year" dest="text"/>
<copyField source="Make" dest="text"/>
<copyField source="Model" dest="text"/>
<copyField source="Fuel" dest="text"/>
<copyField source="CategoryName" dest="text"/>
<copyField source="text" dest="spell"/>

My spell checker config looks like the following: 

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

<!-- <str name="queryAnalyzerFieldType">textSpell</str> -->

<lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">spell</str>
  <str name="buildOnCommit">true</str>
  <str name="buildOnOptimize">true</str>
  <str name="spellcheckIndexDir">C:\Users\kyleg\apache-solr-1.4.0\productGroups\solr\data\spellchecker</str>
</lst>

<!-- a spellchecker that uses a different distance measure
<lst name="spellchecker">
  <str name="name">jarowinkler</str>
  <str name="field">spell</str>
  <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
  <str name="spellcheckIndexDir">./spellchecker2</str>
</lst>
 -->

<!-- a file based spell checker -->
<lst name="spellchecker">
  <str name="classname">solr.FileBasedSpellChecker</str>
  <str name="name">file</str>
  <str name="sourceLocation">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">./spellcheckerFile</str>
</lst>
</searchComponent>


The command that I am sending to try to build looks like the following:
http://localhost:8983/solr/spell/?q=ACORA&version=2.2&start=0&rows=10&indent=on&spellcheck=true&spellcheck.dictionary=default&spellcheck.build=true&spellcheck.collate=true&spellcheck.limit=5


I have also tried to reduce the size of the index to around 10,000 documents
and still no luck.

Any help would be appreciated.

Thank you,
Kyle
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Spellcheck-on-Large-index-size-tp760416p760416.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Solr Replication

2009-08-27 Thread J G

We have multiple solr webapps all running from the same WAR file. Each webapp
is running under the same Tomcat container and I consider each webapp the same
thing as a slice (or instance). I've configured the Tomcat container to
enable JMX and when I connect using JConsole I only see the replication handler
for one of the webapps in the server. I was under the impression each webapp
gets its own replication handler. Is this not true?

It would be nice to be able to have a JMX MBean for each replication handler in
the container so we can get all the same replication information using JMX as
in using the replication admin page for each web app.

Thanks.

From: noble.p...@corp.aol.com
Date: Thu, 27 Aug 2009 13:04:38 +0530
Subject: Re: Solr Replication
To: solr-user@lucene.apache.org

when you say a slice you mean one instance of solr? So your JMX
console is connecting to only one solr?

On Thu, Aug 27, 2009 at 3:19 AM, J G <skinny_joe...@hotmail.com> wrote:

Thanks for the response.

It's interesting because when I run jconsole all I can see is one
ReplicationHandler JMX MBean. It looks like it is defaulting to the first
slice it finds on its path. Is there any way to have multiple replication
handlers, or at least to obtain replication information per slice/instance
via JMX, the way you can see attributes for each slice/instance on each
replication admin JSP page?

Thanks again.

From: noble.p...@corp.aol.com
Date: Wed, 26 Aug 2009 11:05:34 +0530
Subject: Re: Solr Replication
To: solr-user@lucene.apache.org

The ReplicationHandler is not enforced as a singleton, but for all
practical purposes it is a singleton for one core.

If an instance (a slice, as you say) is set up as a repeater, it can
act as both a master and a slave.

In the repeater, the configuration should be as follows:

MASTER
|_SLAVE (I am a slave of MASTER)
|
REPEATER (I am a slave of MASTER and master to my slaves )
|
|
REPEATER_SLAVE( of REPEATER)

The point is that REPEATER will have a slave section with a masterUrl
that points to MASTER, and REPEATER_SLAVE will have a slave section
with a masterUrl that points to REPEATER.

On Wed, Aug 26, 2009 at 12:40 AM, J G <skinny_joe...@hotmail.com> wrote:

Hello,

We are running multiple slices in our environment. I have enabled JMX
and I am inspecting the replication handler mbean to obtain some
information about the master/slave configuration for replication. Is the
replication handler mbean a singleton? I only see one mbean for the
entire server and it's picking an arbitrary slice to report on. So I'm
curious if every slice gets its own replication handler mbean? This is
important because I have no way of knowing in this specific server any
information about the other slices, in particular, information about the
master/slave value for the other slices.

Reading through the Solr 1.4 replication strategy, I saw that a slice
can be configured to be both a master and a slave, i.e. a repeater. I'm
wondering how repeaters work. Say I have a slice named 'A', with the
master on server 1 and the slave on server 2: how do these two servers
communicate to replicate? Looking at the JMX information in the MBean,
both isSlave and isMaster are set to true for my repeater, so how does
this Solr slice know whether it's the master or the slave? I'm a bit
confused.

Thanks.


--
-
Noble Paul | Principal Engineer| AOL | http://aol.com


--
-
Noble Paul | Principal Engineer| AOL | http://aol.com


RE: Solr Replication

2009-08-26 Thread J G

Thanks for the response.

It's interesting because when I run jconsole all I can see is one
ReplicationHandler JMX MBean. It looks like it is defaulting to the first
slice it finds on its path. Is there any way to have multiple replication
handlers, or at least to obtain replication information per slice/instance
via JMX, the way you can see attributes for each slice/instance on each
replication admin JSP page?

Thanks again.

From: noble.p...@corp.aol.com
Date: Wed, 26 Aug 2009 11:05:34 +0530
Subject: Re: Solr Replication
To: solr-user@lucene.apache.org

The ReplicationHandler is not enforced as a singleton, but for all
practical purposes it is a singleton for one core.

If an instance (a slice, as you say) is set up as a repeater, it can
act as both a master and a slave.

In the repeater, the configuration should be as follows:

MASTER
|_SLAVE (I am a slave of MASTER)
|
REPEATER (I am a slave of MASTER and master to my slaves )
|
|
REPEATER_SLAVE( of REPEATER)

The point is that REPEATER will have a slave section with a masterUrl
that points to MASTER, and REPEATER_SLAVE will have a slave section
with a masterUrl that points to REPEATER.

On Wed, Aug 26, 2009 at 12:40 AM, J G <skinny_joe...@hotmail.com> wrote:

Hello,

We are running multiple slices in our environment. I have enabled JMX and I
am inspecting the replication handler mbean to obtain some information
about the master/slave configuration for replication. Is the replication
handler mbean a singleton? I only see one mbean for the entire server and
it's picking an arbitrary slice to report on. So I'm curious if every slice
gets its own replication handler mbean? This is important because I have no
way of knowing in this specific server any information about the other
slices, in particular, information about the master/slave value for the
other slices.

Reading through the Solr 1.4 replication strategy, I saw that a slice can
be configured to be both a master and a slave, i.e. a repeater. I'm
wondering how repeaters work. Say I have a slice named 'A', with the master
on server 1 and the slave on server 2: how do these two servers communicate
to replicate? Looking at the JMX information in the MBean, both isSlave and
isMaster are set to true for my repeater, so how does this Solr slice know
whether it's the master or the slave? I'm a bit confused.

Thanks.


--
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Solr Replication

2009-08-25 Thread J G


Hello,

We are running multiple slices in our environment. I have enabled JMX and I am 
inspecting the replication handler mbean to obtain some information about the 
master/slave configuration for replication. Is the replication handler mbean a 
singleton? I only see one mbean for the entire server and it's picking an 
arbitrary slice to report on. So I'm curious if every slice gets its own 
replication handler mbean? This is important because I have no way of knowing 
in this specific server any information about the other slices, in particular, 
information about the master/slave value for the other slices.

Reading through the Solr 1.4 replication strategy, I saw that a slice can be
configured to be both a master and a slave, i.e. a repeater. I'm wondering how
repeaters work. Say I have a slice named 'A', with the master on server 1 and
the slave on server 2: how do these two servers communicate to replicate?
Looking at the JMX information in the MBean, both isSlave and isMaster are set
to true for my repeater, so how does this Solr slice know whether it's the
master or the slave? I'm a bit confused.

Thanks.





Obtaining SOLR index size on disk

2009-07-17 Thread J G


Hello,

Is it possible to obtain the SOLR index size on disk through the SOLR API? I've 
read through the docs and mailing list questions but can't seem to find the 
answer.

Any help is appreciated.

Thanks.




JMX monitoring for multiple SOLR instances

2009-07-14 Thread J G


Hi,

If I want to run multiple SOLR war files in Tomcat, is it possible to monitor
each of the SOLR instances individually through JMX? Has anyone attempted this
before? Also, what are the implications (e.g. performance) of running multiple
SOLR instances in the same Tomcat server?

Thanks.





solr jmx connection

2009-07-10 Thread J G


 Hello,

I have a SOLR JMX connection issue. I am running my JMX MBeanServer through
Tomcat, meaning I am using Tomcat's MBeanServer rather than any other
MBeanServer implementation.
I am having a hard time figuring out the correct JMX service URL on my
localhost for accessing the SOLR MBeans. My current configuration consists
of the following:

JMX Service url = localhost:9000/jmxrmi

So I have configured JMX to run on port 9000 on Tomcat on my localhost, and
using the above service URL I can access the Tomcat JMX MBeanServer and get
related JVM object information (e.g. I can access the MemoryMXBean object).

However, I am having a harder time trying to access the SOLR MBeans. First, I 
could have the wrong service URL. Second, I'm confused as to which MBeans SOLR 
provides.

You might be asking why I am creating my own client rather than using JConsole;
the reason is that JConsole doesn't provide the features I need.

Anyone with any knowledge or code snippets would be a huge help!

Thank you for your time!

Regards



