date:20100519


 Basically, for some use cases I would like to show duplicates; for others I
 want them ignored.
 
 If I have overwriteDupes=false and I just create the dedup hash, how can I
 query for only unique hash values, i.e. something like a SQL GROUP BY?

TermsComponent maybe? 

or faceting? 
q=*:*&facet=true&facet.field=signatureField&defType=lucene&rows=0&start=0

If you append &facet.mincount=1 to the above URL you can see your duplicates.
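
A minimal sketch of how the faceting request above could be assembled, assuming a Solr instance at a hypothetical local URL; only the parameter names and the signatureField name come from the message. Note that facet.mincount=1 only hides values with a zero count, while a value of 2 would keep just the signatures shared by more than one document.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and path to your installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def signature_facet_url(field="signatureField", mincount=1):
    """Build a facet-only request listing each distinct signature value and its count."""
    params = [
        ("q", "*:*"),            # match every document
        ("defType", "lucene"),
        ("rows", 0),             # no documents wanted, only facet counts
        ("start", 0),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", mincount),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


all_signatures_url = signature_facet_url()
duplicates_only_url = signature_facet_url(mincount=2)
print(all_signatures_url)
print(duplicates_only_url)
```

The facet counts in the response then play the role of the SQL GROUP BY asked about: one entry per distinct hash, with the number of documents that share it.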

Re: Deduplication

 TermsComponent maybe? 
 
 or faceting?
 q=*:*&facet=true&facet.field=signatureField&defType=lucene&rows=0&start=0
 
 If you append &facet.mincount=1 to the above URL you can
 see your duplicates.
 

After re-reading your message: sometimes you want to show duplicates and
sometimes you don't. I have never used FieldCollapsing myself, but I have
heard about it many times.

http://wiki.apache.org/solr/FieldCollapsing

Re: jmx issue with solr

2010-05-19 Thread Jean-Sebastien Vachon

Hi,

Try adding these options...

-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false


On 2010-05-19, at 3:44 AM, Na_D wrote:

 
 Hi,
 
 I am trying to start solr with the following command :
 
 java -Dsolr.solr.home=./example-DIH/solr/ -Dcom.sun.management.jmxremote
 -Dcom.sun.management.jmxremote.port=3000
 
 
 On doing so an error is reported :
 
 Error: Password file read access must be restricted: C:\Program Files\Java\jdk1.6.0_18\jre\lib\management\jmxremote.password
 
 
 The jmxremote.password file is in the lib\management folder and has been
 set to read-only, but the error persists. I am using Windows XP SP3,
 Version 2002, in case that is of any help.
 Please share your suggestions.
 -- 
 View this message in context: 
 http://lucene.472066.n3.nabble.com/jmx-issue-with-solr-tp828478p828478.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: Storing RandomSortField

2010-05-19 Thread Alexandre Rocco

Leonardo,

I was able to use the feature with a dynamic field, as pointed out in the
documentation.
I was just curious to take a peek at the values that are generated, even
when the field is not dynamic, so I tried to figure out a way to do so.
Some output when the debug query is enabled would be useful, but it
seems it's not implemented yet.
I will take a look at the classes and see what I can do about it.

Thanks!

On Wed, May 19, 2010 at 5:34 AM, Leonardo Menezes 
leonardo.menez...@googlemail.com wrote:

 Hey,
   for random sorting, the random values are generated at runtime, using the seed
 you passed as one of the parameters, among other things. This way, if the
 seed is the same in different requests, the sort order should be the same.
 For debugging purposes you could also edit the random sort field class and
 add some traces, so it prints, for example, the document id and the value
 generated for it. But the values won't be stored in the index.

 cheers

 On Wed, May 19, 2010 at 10:00 AM, Marco Martinez 
 mmarti...@paradigmatecnologico.com wrote:

  Hi Alexandre,
 
  I am not totally sure about this, but the random sort field is only used
  to do a random sort on your searches, and you have to pass different values
  to get different sort orders. It only applies at search time, so no value
  is indexed. You will find more information here:
 
 
 http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
 
  Marco Martínez Bautista
  http://www.paradigmatecnologico.com
  Avenida de Europa, 26. Ática 5. 3ª Planta
  28224 Pozuelo de Alarcón
  Tel.: 91 352 59 42
 
 
  2010/5/18 Alexandre Rocco alel...@gmail.com
 
   Hi guys,
  
   Is there any way to make a RandomSortField stored?
   I'm trying to do it for debugging purposes.
   My intention is to take a look at the values that are stored there to
   determine the sorting that is being applied to the results.
  
   I tried to make it a stored field:
   <field name="randomorder" type="random" stored="true" />
  
   I also tried to create another text field, copying the result from
   the random field like this:
   <field name="randomorderdebug" type="text" indexed="true" stored="true"/>
   <copyField source="randomorder" dest="randomorderdebug"/>
  
   Neither of the approaches worked.
   Is there any restriction on this kind of field that prevents it from
  being
   displayed in the results?
  
   Thanks,
   Alexandre
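
The behaviour described in this thread (values derived from a seed at query time, the same seed giving the same order, nothing written to the index) can be illustrated with a small sketch. This is not Solr's RandomSortField implementation; the hashing scheme, the function names and the document ids are assumptions chosen only to show the idea.

```python
import hashlib


def random_sort_key(doc_id, seed):
    """Derive a repeatable pseudo-random number from a document id and a seed.

    Illustration only: the real field type uses its own hashing scheme.
    """
    digest = hashlib.sha1(f"{seed}:{doc_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def random_order(doc_ids, seed):
    """Sort document ids by their seed-dependent key; no key is ever stored."""
    return sorted(doc_ids, key=lambda doc_id: random_sort_key(doc_id, seed))


docs = ["doc1", "doc2", "doc3", "doc4", "doc5"]  # hypothetical ids
first = random_order(docs, seed=1234)
again = random_order(docs, seed=1234)

# Printing id and key is the kind of trace suggested for debugging.
for doc_id in first:
    print(doc_id, random_sort_key(doc_id, 1234))
```

Because the key is recomputed for every request, there is nothing for a stored field or a copyField to pick up, which is consistent with both attempts above returning no value.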

Re: jmx issue with solr

2010-05-19 Thread Na_D


Thanks for the info; using the above properties solved the issue.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/jmx-issue-with-solr-tp828478p829057.html
Sent from the Solr - User mailing list archive at Nabble.com.

defaultSearchField

2010-05-19 Thread Antonello Mangone

Hi everyone, I'd like to know if it's possible to use
*defaultSearchField* with more than one field?

i.e.

<defaultSearchField>field1, field2, field3</defaultSearchField>


Thank you all

Re: defaultSearchField

 Hi everyone, I'd like to know if it's possible to use
 *defaultSearchField* with more than one field?
 
 i.e.
 
 <defaultSearchField>field1, field2, field3</defaultSearchField>
 

No. But you can query multiple fields using dismax. 

qf=field1,field2,field3&defType=dismax

http://wiki.apache.org/solr/DisMaxRequestHandler
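
A minimal sketch of building such a dismax request, assuming a hypothetical local Solr URL and the placeholder field names from the question. The DisMax documentation shows qf as a whitespace-separated list, so the sketch joins the fields with spaces rather than the commas used in the message above.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to your installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def dismax_url(user_query, fields):
    """Build a request that searches user_query across several fields via dismax."""
    params = [
        ("q", user_query),
        ("defType", "dismax"),
        ("qf", " ".join(fields)),  # urlencode turns the spaces into '+'
    ]
    return SOLR_SELECT + "?" + urlencode(params)


url = dismax_url("glass door", ["field1", "field2", "field3"])
print(url)
```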

Re: defaultSearchField

2010-05-19 Thread Jan Kammer

There is something called dismax-requesthandler. I think this is what 
you are looking for.


greetz, Jan


Am 19.05.2010 15:47, schrieb Antonello Mangone:

Hi everyone, I'd like to know if it's possible to use
*defaultSearchField* with more than one field?

i.e.

<defaultSearchField>field1, field2, field3</defaultSearchField>


Thank you all

Challenge: Searching for variant products and get basic products in result set


I am implementing search for products. Each base product exists in several
variants. One variant has a glass door, another a steel door, etc. The variants
can have different prices. The base product does not really exist; only the
variants exist in real life. The case is comparable to cars: the car model is
the base product, with variants for color, automatic/manual transmission, etc.

I want to search for variants, but I only want to have base products in the
result. I.e. when one or more variants of the same base product are found,
only the base product should be in the search result.

Does somebody have an idea how this could be done?

Best regards

Henning
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH. behavior after a import. Log, delete table !?

2010-05-19 Thread stockii


hey, thx

I did everything you said.

I created a JAR file; this JAR file deletes my table.

But Solr absolutely refuses to start this JAR. I put a run.bat file into the
folder where my JAR is saved. The batch file runs and deletes the table, but
when Solr starts the batch file, it doesn't work. I don't know why.
I tested the batch file in different ways and it should work... help ^^

windows xp for test ;-)
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-behavior-after-a-import-Log-delete-table-tp823232p829230.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Leonardo Menezes

Would you then need to know which variant produced the match?
Because if not, you can just index the whole thing as one single document...

Re: Embedded Server, Caching, Stats page updates

2010-05-19 Thread Antoniya Statelova


 The way you phrased that paragraph makes me think that one of us doesn't
 understand what exactly you did when you switched ...


Switched works for the specific setup i'm using - the server would refer
to itself in the CommonHttpSolrServer request sent, i.e. it would run both
the server and client sides. Removing this and simply using
EmbeddedSolrServer just made the setup a little more sane in that aspect.
Does that make more sense now?


 Now for starters: if the remote server you were running solr on is more
 powerful than the local machine you are running your java application on,
 that alone could explain some performance differences (likewise for JVM
 settings).

The machine I'm running it on is exactly the same: the code change was
pushed and I measured performance before and after, under the same load (since
it's a testing machine I could regulate that). That's why I was so surprised
that removing that additional HTTP request didn't bring any improvement.


 Most importantly: when running solr embedded in your application, there is
 no stats.jsp page for you to look at -- because solr is no longer
 running in a servlet container.  so if you are seeing stats on your
 solr server that say your caches aren't being hit, the reason is because
 the server isn't being hit at all.


This is nice to know, I didn't look into how the actual page was generated.
I expected something like this to be true. Thank you!


 When running an embedded solr server, the filterCache and queryResultCache
 will still be used.  the settings in the solrconfig.xml you specify when
 initializing the SolrCore will be honored.  you can use JMX to monitor
 those cache hit rates (assuming you have JMX enabled for your application,
 and the appropriate setting is in your solrconfig.xml)

I'll look into using JMX, thanks for the suggestion.

Tony

Re: Challenge: Searching for variant products and get basic products in result set

thanks. Currently not, but requirements change all the time as always ;-)
If we get a requirement, that a facet shall be material of doors, we will
need to know which variant was the hit. I would like to be prepared for
that.

--
View this message in context:
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Leonardo Menezes

If that is so, and you have, for example, two car variants with automatic
transmission, what would define which one was the hit? Or do the fields not
share common information across variants? If they do share it, you wouldn't be
able to say which one was the hit (because it was on both of them) and would
either have to pick one randomly or retrieve both. If they don't share that
info, you would have that covered, since only one would match any given
query.

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Nagelberg, Kallin

I agree that pulling all attributes into the parent sku during indexing could 
work well. Define a Boolean field like 'isVirtual' to identify the non-leaf 
skus, and use a multi-valued field for each of the attributes. For now you can 
do a search like (isVirtual:true AND doorType:screen). If at a later date you 
want the actual variants just search for isVirtual:false.

Does that work?

-Kallin Nagelberg

Re: Challenge: Searching for variant products and get basic products in result set

You are right, in that case an arbitrary one would have to be chosen, or
probably both should be in the result set. Difficult to say what the
marketing department would like ;-)

--
View this message in context:
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829413.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Challenge: Searching for variant products and get basic products in result set

Sorry, what does sku mean?

I understand you like this: index the base product and its variants, and
include all attributes (for one base and its variants) in each document. I
think that would work. Thanks.

--
View this message in context:
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829435.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Nagelberg, Kallin

Sorry, in North America 'sku' (stock keeping unit) is the common term in 
business to specifically identify a particular product, 
http://lmgtfy.com/?q=sku. 

And yes, I think you understand me. I am imagining you can structure your 
products in a hierarchy. For each node in the tree you traverse all children, 
collecting their attributes into the current node.
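
The roll-up described above can be sketched as follows, for a single level of hierarchy (one base product with its variants). The isVirtual and doorType field names come from this thread; the ids, the price field and the function name are hypothetical, and a deeper product tree would need the same merge applied recursively.

```python
from collections import defaultdict


def build_parent_doc(base_id, variants):
    """Merge the attributes of all variants into one 'virtual' parent document."""
    merged = defaultdict(set)
    for variant in variants:
        for name, value in variant.items():
            if name != "id":
                merged[name].add(value)

    doc = {"id": base_id, "isVirtual": True}
    for name, values in merged.items():
        doc[name] = sorted(values)  # becomes a multi-valued field
    return doc


variants = [
    {"id": "oven-a-glass", "doorType": "glass", "price": 499},
    {"id": "oven-a-steel", "doorType": "steel", "price": 549},
]
parent = build_parent_doc("oven-a", variants)
print(parent)
```

Indexing the parent alongside the variants (each variant carrying isVirtual false) keeps both query styles from the earlier message available: isVirtual:true for base products only, isVirtual:false when the actual variants are needed.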

-Kallin Nagelberg

Re: DIH. behavior after a import. Log, delete table !?

 I created a JAR file; this JAR file deletes my table.
 
 But Solr absolutely refuses to start this JAR. I put a run.bat file into the
 folder where my JAR is saved. The batch file runs and deletes the table, but
 when Solr starts the batch file, it doesn't work. I don't know why.
 I tested the batch file in different ways and it should work... help ^^
 
 windows xp for test ;-)

I don't know why, but it seems that we need to set dir to something other than '.'
Anyway, I got it working on Windows in two ways:

1-)
<updateHandler class="solr.DirectUpdateHandler2">
 <listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">java</str>
  <str name="dir">solr/bin</str>
  <arr name="args"> <str>-jar</str> <str>junk.jar</str> </arr>
  <bool name="wait">true</bool>
 </listener>
</updateHandler>

2-) Giving full paths:

<updateHandler class="solr.DirectUpdateHandler2">
 <listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">C:\test.bat</str>
  <str name="dir">C:\</str>
  <bool name="wait">true</bool>
 </listener>
</updateHandler>

It should work this time on Windows.

RE: Challenge: Searching for variant products and get basic products in result set

Yes, I think that will make a good solution. In Danish sku is a bad word
;-), but thanks for the info.

--
View this message in context:
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829530.html
Sent from the Solr - User mailing list archive at Nabble.com.

Solr Delta Queries

2010-05-19 Thread Vladimir Sutskever

I have an indexed_timestamp field in my index, which lets me know when a
document was indexed:

<field name="indexed_timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>


For some reason, when doing delta indexing via DIH, this field is not being
updated.

Are timestamp fields updated during delta updates?



Kind regards,

Vladimir Sutskever
Investment Bank - Technology
JPMorgan Chase, Inc.



Re: index merge

2010-05-19 Thread uma m


Hi All,

  I am running Solr on a 64-bit HP-UX system. The total index size is about
5GB, and when I try to load any new document, Solr first tries to merge the
existing segments, which results in the following error. I can see a temp file
growing within the index dir to around 2GB in size, and then it fails with this
exception. It looks like the exception occurs on reaching Integer.MAX_VALUE.

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: File too large (errno:27)
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
Caused by: java.io.IOException: File too large (errno:27)
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:456)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
    at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
    at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
    at org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.close(SimpleFSDirectory.java:199)
    at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:144)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:357)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5029)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4614)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)

---

The solrconfig.xml contains the default values for the indexDefaults and
mainIndex sections, as shown below.

  <indexDefaults>
    <!-- Values here affect all index writers and act as a default unless
         overridden. -->
    <useCompoundFile>false</useCompoundFile>

    <mergeFactor>10</mergeFactor>
    <!-- If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene
         will flush based on whichever limit is hit first.  -->
    <!--<maxBufferedDocs>1000</maxBufferedDocs>-->

    <!-- Sets the amount of RAM that may be used by Lucene indexing
         for buffering added documents and deletions before they are
         flushed to the Directory.  -->
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <!-- <maxMergeDocs>2147483647</maxMergeDocs> -->
    <maxFieldLength>1</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>1</commitLockTimeout>
    <!--<mergePolicy class="org.apache.lucene.index.LogByteSizeMergePolicy"/>-->
    <!--<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>-->
  </indexDefaults>
  <mainIndex>
    <!-- options specific to the main on-disk lucene index -->
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
    <!-- Deprecated -->
    <!--<maxBufferedDocs>1000</maxBufferedDocs>-->
    <!--<maxMergeDocs>2147483647</maxMergeDocs>-->
  </mainIndex>


Could anyone help me to resolve this exception?

Regards,
Uma
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/index-merge-tp472904p829810.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: index merge

 I am running Solr on a 64-bit HP-UX system. The total index size is about
 5GB, and when I try to load any new document, Solr tries to merge the existing
 segments first, which results in the following error. I could see a temp file
 growing within the index dir to around 2GB in size before it fails with this
 exception. It looks like the exception occurs on reaching Integer.MAX_VALUE.

 <ramBufferSizeMB>32</ramBufferSizeMB>

Isn't a 32MB ramBufferSizeMB too small?
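
The 2GB observation in the original report can be checked independently of Solr. The stack trace fails inside java.io.RandomAccessFile with "File too large", which suggests (this is an assumption, not something confirmed in the thread) that the process or the filesystem refuses files larger than Integer.MAX_VALUE bytes. Below is a minimal Java sketch that tries to write a single byte just before and just after that boundary in a given directory. The class name LargeFileProbe and the method canWriteAt are made up for illustration. On a filesystem without sparse-file support the probe may really allocate about 2GB, so run it where that is acceptable.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LargeFileProbe {

    /**
     * Creates a temp file in dir, seeks to offset, and writes one byte.
     * Returns true only if the write succeeded and the file is offset + 1 bytes long.
     */
    public static boolean canWriteAt(Path dir, long offset) {
        Path probe = null;
        try {
            probe = Files.createTempFile(dir, "probe", ".tmp");
            try (RandomAccessFile raf = new RandomAccessFile(probe.toFile(), "rw")) {
                raf.seek(offset);   // seeking past the end is allowed; the write extends the file
                raf.write(1);
                return raf.length() == offset + 1;
            }
        } catch (IOException e) {
            System.err.println("Write at offset " + offset + " failed: " + e.getMessage());
            return false;
        } finally {
            if (probe != null) {
                try {
                    Files.deleteIfExists(probe);   // always clean up the probe file
                } catch (IOException ignored) {
                    // nothing useful to do if cleanup fails
                }
            }
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java LargeFileProbe <index-dir>");
            return;
        }
        Path dir = Paths.get(args[0]);
        long boundary = Integer.MAX_VALUE;   // 2^31 - 1 bytes, just under 2 GiB
        System.out.println("below boundary: " + canWriteAt(dir, boundary - 1024));
        System.out.println("above boundary: " + canWriteAt(dir, boundary + 1024));
    }
}
```

If the first probe succeeds and the second fails when run as the same user that runs Solr, the limit is most likely in the operating system or filesystem configuration rather than in the Solr settings quoted above.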

The Seven Deadly Sins of Solr spanish translation

2010-05-19 Thread Juan Pedro Danculovic

Hello, I translated this article into Spanish. It is very helpful for avoiding
common mistakes in Solr installations.

http://www.linebee.com/?p=434&lang=es

Thanks,

Juan

Re: Custom sorting

2010-05-19 Thread Daniel Cassiano

Hi Dan,

It seems that you want a SearchComponent[1], something like the
QueryElevationComponent[2].
Take a look at how it works, and I think you can build your custom solution.

[1]-
http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html
[2]- http://wiki.apache.org/solr/QueryElevationComponent


Cheers,

-- Daniel Cassiano

http://dcassiano.wordpress.com


On Wed, May 19, 2010 at 6:46 AM, dan sutton danbsut...@gmail.com wrote:

 Hi,

 I have a requirement to do the following:

 For up to the first 10 results (i.e. only on the first page) show
 sponsored category ads, in order of bid, but no more than 2 / category,
 and only if all sponsored cat' ads are more than min% of the highest
 score. e.g. If I had the following:

 min% =1


 doc  score  bid  cat_id  sponsored
  1    100    x     x        0
  2     55    x     x        0

  3     50    2     2        1
  4     20    2     2        1
  5     05    2     2        1

  6     80    1     1        1
  7     70    1     1        1
  8     60    1     1        1

 x = don't care

 sorted order would be:

 3
 4

 6
 7

 1
 8
 2
 5

 I'm not sure if this can be implemented with a custom comparator, as I
 need access to the final score to enforce min%. I'm thinking I'm
 probably going to have to implement a subclass of QParserPlugin with a
 custom sort, but was wondering if there were alternatives?

 Many thanks in advance.
 Dan
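
The ordering rule in the question can be sketched outside Solr as a post-processing step over a scored result list, which is roughly what a custom SearchComponent would have to do. The Java below is only an illustration: the class SponsoredRanker, the Doc holder, and the parameter names are invented here, and it assumes that min% is applied per ad (a sponsored doc is promoted only if its score is at least min% of the top score) and that ties on bid are broken by score. Neither assumption is confirmed in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SponsoredRanker {

    /** Hypothetical result row; fields mirror the columns of the table in the question. */
    public static final class Doc {
        final int id;
        final double score;
        final int bid;      // ignored for non-sponsored docs
        final int catId;    // ignored for non-sponsored docs
        final boolean sponsored;

        public Doc(int id, double score, int bid, int catId, boolean sponsored) {
            this.id = id;
            this.score = score;
            this.bid = bid;
            this.catId = catId;
            this.sponsored = sponsored;
        }
    }

    /** Returns doc ids: promoted ads first (by bid, capped per category), then the rest by score. */
    public static List<Integer> rank(List<Doc> hits, double minPercent, int maxAds, int maxPerCategory) {
        double topScore = 0;
        for (Doc d : hits) {
            topScore = Math.max(topScore, d.score);
        }
        double threshold = topScore * minPercent / 100.0;

        // Candidate ads: sponsored docs above the threshold, highest bid first, then highest score.
        List<Doc> ads = new ArrayList<>();
        for (Doc d : hits) {
            if (d.sponsored && d.score >= threshold) {
                ads.add(d);
            }
        }
        ads.sort(Comparator.comparingInt((Doc d) -> d.bid).reversed()
                .thenComparing(Comparator.comparingDouble((Doc d) -> d.score).reversed()));

        List<Integer> ordered = new ArrayList<>();
        Set<Integer> promoted = new HashSet<>();
        Map<Integer, Integer> perCategory = new HashMap<>();
        for (Doc ad : ads) {
            if (promoted.size() >= maxAds) {
                break;                                   // first page is full of ads
            }
            int used = perCategory.getOrDefault(ad.catId, 0);
            if (used >= maxPerCategory) {
                continue;                                // category cap reached
            }
            perCategory.put(ad.catId, used + 1);
            promoted.add(ad.id);
            ordered.add(ad.id);
        }

        // Everything that was not promoted falls back to plain score order.
        List<Doc> rest = new ArrayList<>();
        for (Doc d : hits) {
            if (!promoted.contains(d.id)) {
                rest.add(d);
            }
        }
        rest.sort(Comparator.comparingDouble((Doc d) -> d.score).reversed());
        for (Doc d : rest) {
            ordered.add(d.id);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Doc> hits = Arrays.asList(
                new Doc(1, 100, 0, 0, false), new Doc(2, 55, 0, 0, false),
                new Doc(3, 50, 2, 2, true), new Doc(4, 20, 2, 2, true), new Doc(5, 5, 2, 2, true),
                new Doc(6, 80, 1, 1, true), new Doc(7, 70, 1, 1, true), new Doc(8, 60, 1, 1, true));
        System.out.println(rank(hits, 1.0, 10, 2));   // prints [3, 4, 6, 7, 1, 8, 2, 5]
    }
}
```

With the table from the question and min% = 1, this reproduces the sorted order given there. Because the threshold depends on the top score of the whole result set, the logic needs the final scores, which is why a plain field comparator is not enough.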

Re: disable caches in real time


: I've always understood that if you do a commit (replication does it), a new
: searcher is opened, and you lose performance (queries per second) while the
: caches are regenerated. I think I didn't explain my situation correctly

not if you configure your caches with autowarming -- then solr will warm 
up the new caches (on the new index) while the old index still serves 
requests -- this is all managed for you by the SolrCore, no need for core 
swapping.


-Hoss

RE: disable caches in real time

2010-05-19 Thread Nagelberg, Kallin

I suppose you are still losing some performance on the replicated box since it 
needs to use some resources to warm the cache. It would be nice if a warmed 
cache could be replicated from the master though perhaps that's not practical. 
Chris is right though: The newly updated index created by a commit is not seen 
by users until it has been warmed, at which point it is atomically swapped.

-Kallin Nagelberg
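
The warm-then-swap behaviour described in this thread can be illustrated with a small standalone Java sketch. This is not Solr's code: the classes WarmThenSwap and Searcher are hypothetical stand-ins, and "warming" is reduced to replaying the old cache's keys against the new searcher. It only shows the general pattern in which the old searcher keeps serving requests until the new one has been warmed, and the switch itself is a single atomic reference update.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

public class WarmThenSwap {

    /** Hypothetical stand-in for a searcher: an index version plus its own query cache. */
    public static final class Searcher {
        final int indexVersion;
        final Map<String, String> cache = new ConcurrentHashMap<>();

        Searcher(int indexVersion) {
            this.indexVersion = indexVersion;
        }

        String search(String query) {
            // A cache miss stands in for the expensive work of running the query.
            return cache.computeIfAbsent(query, q -> "results for '" + q + "' @v" + indexVersion);
        }
    }

    // Requests always read whatever searcher is currently published here.
    private final AtomicReference<Searcher> live = new AtomicReference<>(new Searcher(1));

    public String query(String q) {
        return live.get().search(q);
    }

    /** Builds a searcher for the new index, warms it from the old cache, then publishes it. */
    public void commit(int newVersion) {
        Searcher old = live.get();
        Searcher fresh = new Searcher(newVersion);
        for (String q : old.cache.keySet()) {
            fresh.search(q);          // warming: the old searcher is still serving traffic here
        }
        live.set(fresh);              // atomic swap: later requests see only the warmed searcher
    }

    public int liveVersion() {
        return live.get().indexVersion;
    }

    public int liveCacheSize() {
        return live.get().cache.size();
    }

    public static void main(String[] args) {
        WarmThenSwap core = new WarmThenSwap();
        core.query("solr");
        core.query("lucene");
        core.commit(2);
        System.out.println("version=" + core.liveVersion() + " warmedEntries=" + core.liveCacheSize());
    }
}
```

The warming loop still costs CPU on the box doing it, which is the residual overhead mentioned above, but no request is ever answered by a cold searcher.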



-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Wednesday, May 19, 2010 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: disable caches in real time


: I've always understood that if you do a commit (replication does it), a new
: searcher is opened, and you lose performance (queries per second) while the
: caches are regenerated. I think I didn't explain my situation correctly

not if you configure your caches with autowarming -- then solr will warm 
up the new caches (on the new index) while the old index still serves 
requests -- this is all managed for you by the SolrCore, no need for core 
swapping.


-Hoss

Stemming Filters in wiki

2010-05-19 Thread Asif Rahman

I see that the entries for PorterStemFilterFactory,
EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been
removed from the Analyzers, Tokenizers, and Token Filters wiki page.  Is
there a reason for this?

Thanks,

asif


-- 
Asif Rahman
Lead Engineer - NewsCred
a...@newscred.com
http://platform.newscred.com

Re: Embedded Server, Caching, Stats page updates


: Switched works for the specific setup i'm using - the server would refer
: to itself in the CommonHttpSolrServer request sent, i.e. it would run both
: the server and client sides. Removing this and simply using
: EmbeddedSolrServer just made the setup a little more sane in that aspect.
: Does that make more sense now?

not really ... what *exactly* did you change about your setup and 
your client code?  please be specific -- how did you run solr
before when you were using CommonsHttpSolrServer? what are *all* of the 
steps you did when you switched to EmbeddedSolrServer (specifically: what 
did the changes to your java client code look like, and what did you 
change about how you run solr)

Because if you still have the solr.war running in your servlet container, 
and all you did is edit your java code to use EmbeddedSolrServer (pointing 
at the same directory on disk) instead of CommonsHttpSolrServer, then you 
are now running *two* instances of Solr in your VM, both reading from the 
same indexes.


-Hoss

Re: Stemming Filters in wiki

2010-05-19 Thread Robert Muir

Hi Asif,

These entries were moved here: http://wiki.apache.org/solr/LanguageAnalysis

On Wed, May 19, 2010 at 2:49 PM, Asif Rahman a...@newscred.com wrote:
 I see that the entries for PorterStemFilterFactory,
 EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been
 removed from the Analyzers, Tokenizers, and Token Filters wiki page.  Is
 there a reason for this?

 Thanks,

 asif


 --
 Asif Rahman
 Lead Engineer - NewsCred
 a...@newscred.com
 http://platform.newscred.com




-- 
Robert Muir
rcm...@gmail.com

Re: Moving from Lucene to Solr?


: Subject: Moving from Lucene to Solr?
: References: aanlktimxy1wscs_bjzkkkdy7dlrw1iober5kzszrf...@mail.gmail.com
: In-Reply-To: aanlktimxy1wscs_bjzkkkdy7dlrw1iober5kzszrf...@mail.gmail.com

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is hidden in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking



-Hoss

Re: Stemming Filters in wiki