Re: SOLR-236 Patch
Hi Sam, It seems that the patch is out of sync with the trunk again. Can you try patching with revision 955615? I'll update the patch shortly. Martijn

On 24 June 2010 09:49, Amdebirhan, Samson, VF-Group samson.amdebir...@vodafone.com wrote: Hi, Trying to apply the SOLR-236 patch to trunk I get what follows. Can anyone help me understand what I am missing?

svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk
patch -p0 -i SOLR-236-trunk.patch --dry-run
patching file solr/src/test/org/apache/solr/search/fieldcollapse/MyDocTermsIndex.java
patching file solr/src/java/org/apache/solr/handler/component/CollapseComponent.java
patching file solr/src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
patching file solr/src/java/org/apache/solr/search/fieldcollapse/collector/FieldValueCountCollapseCollectorFactory.java
patching file solr/src/java/org/apache/solr/search/fieldcollapse/collector/DocumentGroupCountCollapseCollectorFactory.java
can't find file to patch at input line 1068
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--
|Index: solr/src/java/org/apache/solr/search/DocSetHitCollector.java
|===
|--- solr/src/java/org/apache/solr/search/DocSetHitCollector.java (revision 922957)
|+++ solr/src/java/org/apache/solr/search/DocSetHitCollector.java (revision )

Regards Sam
XML DataImportHandler copy + resize pictures in localhost?
Hi, I'm adding documents to Solr via XML files and the DataImportHandler. In the XML file I've got some product picture links:

<picture>
  <picture_url>http://www.example.com/pic.jpg</picture_url>
</picture>

I would like to keep a local thumbnail of these pictures on the local server in order to avoid long external loading times. Example: Original picture: http://www.example.com/pic.jpg is 800x600px == conversion ==> Local picture: http://localhost/pic.jpg in 100x100px. Is there a way to do this? Thanks for your help. Marc
Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Hi Mitch, thanks for the answer and the link. The use case is to provide content-based recommendations for a single item, no matter where that item came from. So, this input (match) item is the best match, all more-like-this items are compared to it, and the ones that are the most alike would have the highest scores. (Meaning also that the most similar are probably not as good as recommendations because they are too similar. But that is a different story.) Again, I don't want to compare the scores of regular search results (e.g. from dismax) with those of mlt. I only want a way to show the user a kind of relevancy or similarity indicator (for example using a range of 10 stars) that would give a hint on how similar the mlt hit is to the input (match) item. Greetings from Munich ;-) Chantal On Thu, 2010-06-24 at 17:06 +0200, MitchK wrote: Chantal, have a look at http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/similar/MoreLikeThis.html to get an idea of what the MLT's score is based on. The problem is that you can't compare scores. The query for the normal result-response was maybe something like Bill Gates featuring Linus Torvald - The perfect OS song. The user now picks one of the returned documents and says he wants More like this - maybe because the topic concerned was okay, but the content was not enough, or whatever... But the sent query is totally different (as you can see in the link) - so that would be like comparing apples and oranges, since they do not use the same base. What would be the use case? Why is score-normalization needed? Kind regards from Germany, - Mitch
Re: performance sorting multivalued field
*There are lots of docs with the same value; I mention that because I supposed that having the same value has nothing to do with the number of un-inverted term instances. It does: I've been able to reproduce the error by setting different values for each field:

HTTP Status 500 - there are more terms than documents in field date, but it's impossible to sort on tokenized fields java.lang.RuntimeException: there are more terms than documents in field id, but it's impossible to sort on tokenized fields at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:706)...

But it's already fixed for the Lucene 2.9.4, 3.0.3, 3.1 and 4.0 versions: https://issues.apache.org/jira/browse/LUCENE-2142 -- View this message in context: http://lucene.472066.n3.nabble.com/performance-sorting-multivalued-field-tp905943p921752.html Sent from the Solr - User mailing list archive at Nabble.com.
DIH - $deleteDocById
I seem to have a hard time getting $deleteDocById to work with the XPathEntityProcessor. Has anyone tested it and got it to work? Here's a snippet of the config:

..
<field column="id" xpath="/io/article/@id"/>
<field column="source" xpath="/io/article/secti...@homesection='yes']/@source"/>
..
<field column="unique_id" template="${document.source}_${document.id}"/>
<field column="$deleteDocById" regex="^^(published)$" repaceWith="${document.unique_id}" sourceColName="state"/>
..

Whenever I try to run a delta-import with a document that should be deleted from the index, it only updates the document in the index. The last line in the config above is based on a tip I found on the net; I'm unsure if it's correct. Any help would be appreciated. Regards, Ingar
Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Hi Chantal, Munich? Germany seems to be soo small :-). Chantal Ackermann wrote: I only want a way to show to the user a kind of relevancy or similarity indicator (for example using a range of 10 stars) that would give a hint on how similar the mlt hit is to the input (match) item. Okay, that's making more sense. Unfortunately, you cannot do that with Lucene with results that might fit your needs (as far as I know). Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/MoreLikeThis-mlt-use-the-match-s-maxScore-for-result-score-normalization-tp919598p921942.html Sent from the Solr - User mailing list archive at Nabble.com.
[ANN] Solr 1.4.1 Released
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Apache Solr 1.4.1 has been released and is now available for public download! http://www.apache.org/dyn/closer.cgi/lucene/solr/ Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required. Solr 1.4.1 is a bug fix release for Solr 1.4 that includes many Solr bug fixes as well as Lucene bug fixes from Lucene 2.9.3. 
See all of the CHANGES here: http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.1/CHANGES.txt - - Mark Miller on behalf of the Solr team -BEGIN PGP SIGNATURE- Version: GnuPG/MacGPG2 v2.0.14 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBAgAGBQJMJK3AAAoJED+/0YJ4eWrIrfAP/RLD7QvreOBFebICN/eiRzCH 1dHOt9Scn7qGQU4RvXZ8GQq37AuoRMgmgckntttFLCCD5w5A29/GxzyZbAoQDQ0B OkaHsYIcUuhbLq8QtlTjt+rK3gc6oxMoCRMJBS7DfUFUyROl6om4gpYAVem50qDy FfBdgRxp4VZ07E7VwmMvma03nSrKuvX0bwE8NXksaCAVsvkmi8Sh7aLMPPVHgsuD pbY8kB0hXCULJgs9ZAc2t6+T38+eV9wxJSeAktVlGAvNlYTavW2bxzF5wQk+kXCd DwGjdlU9/ebHdx3MHJyE0zXSl4rGFsy8zfh/ntk7UV7qklQ2jn5Ur18zLqv4vkb1 Ea78GpoqCZWlMGcRUSErtH33cGs4blo/kuJZj/VLrk6jxO4x4beUsAfRcM/YliJW Z6OuFtpcdVDjVl4aB2xbAMwDl2DXqgyNmlxs8vvqdRoDhN8wZ91raO0kkbrkzj1f 5gPD//Efx6RcrYtXAV3HKAwI7FLP8MhzFu1Y2FK2FY7DyFNmirad03+pB6bFs1xq ARU6pdeTYvv+PsWH3Keaw/L/nb0BYbU8R1sVhkvjm+S9gJ6cCcKJkeAkNgL+6QNm JPJ5VeXVFGVmwzQ5mE3j6qX1uDrEmLA2T5Dd7bssWtwveLoyfo0s7qezIfbRamnc T3iyCE6cuSU9CvCEqN+o =nBB9 -END PGP SIGNATURE-
Re: Recommended MySQL JDBC driver
On 18.05.2010, at 17:22, Shawn Heisey wrote: On 5/14/2010 12:40 PM, Shawn Heisey wrote: I downgraded to 5.0.8 for testing. Initially, I thought it was going to be faster, but it slows down as it gets further into the index. It now looks like it's probably going to take the same amount of time. On the server timeout thing - that's a setting you'd have to put in my.ini or my.cnf; there may also be a way to change it on the fly without restarting the server. I suspect that when you are running a multiple-query setup like yours, it opens multiple connections, and when one of them is busy doing some work, the others are idle. That may be related to the timeout with the older connector version. On my setup, I only have one query that retrieves records, so I'm probably not going to run into that. I could be wrong about how it works - you can confirm or refute this idea by looking at SHOW PROCESSLIST on your MySQL server while it's working. I was having no trouble with the 5.0.8 connector on 1.5-dev build 922440M, but then I upgraded the test machine to the latest 4.0 from trunk and ran into the timeout issue you described, so I am going back to the 5.1.12 connector. I just saw the message on the list about branch_3x in SVN, which looks like a better option than trunk. Any news on this topic? regards, Lukas Kahwe Smith m...@pooteeweet.org
Re: [ANN] Solr 1.4.1 Released
Congrats on the release! Something seems to be wrong with the Solr 1.4.1 maven artifacts; there is an extra solr in the path. E.g. solr-parent-1.4.1.pom is at http://repo1.maven.org/maven2/org/apache/solr/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom while it should be at http://repo1.maven.org/maven2/org/apache/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom. The POMs seem to contain the correct maven artifact coordinates. Regards, Stevo. On Fri, Jun 25, 2010 at 3:23 PM, Mark Miller markrmil...@apache.org wrote: Apache Solr 1.4.1 has been released and is now available for public download! http://www.apache.org/dyn/closer.cgi/lucene/solr/ [...announcement snipped...] See all of the CHANGES here: http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.1/CHANGES.txt - Mark Miller on behalf of the Solr team
Re: [ANN] Solr 1.4.1 Released
Can a solr/maven dude look at this? I simply used the copy command on the release to-do wiki (sounds like it should be updated). If no one steps up, I'll try and straighten it out later. On 6/25/10 10:28 AM, Stevo Slavić wrote: Congrats on the release! Something seems to be wrong with the Solr 1.4.1 maven artifacts; there is an extra solr in the path. E.g. solr-parent-1.4.1.pom is at http://repo1.maven.org/maven2/org/apache/solr/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom while it should be at http://repo1.maven.org/maven2/org/apache/solr/solr-parent/1.4.1/solr-parent-1.4.1.pom. The POMs seem to contain the correct maven artifact coordinates. Regards, Stevo. On Fri, Jun 25, 2010 at 3:23 PM, Mark Miller markrmil...@apache.org wrote: Apache Solr 1.4.1 has been released and is now available for public download! http://www.apache.org/dyn/closer.cgi/lucene/solr/ [...announcement snipped...] See all of the CHANGES here: http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.1/CHANGES.txt - Mark Miller on behalf of the Solr team
Re: SweetSpotSimilarity
Would someone mind explaining how this differs from the DefaultSimilarity? The difference is length normalization. The default one punishes long documents. The sweet one computes a constant norm for all lengths in the [min,max] range (the sweet spot), and smaller norm values for lengths outside this range. Documents shorter or longer than the sweet spot range are punished. See Section 4.1 of http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf Also, how would one replace the use of the DefaultSimilarity class with this one? I can't seem to find any such configuration in solrconfig.xml. It is in schema.xml: <similarity class="org.apache.lucene.search.SweetSpotSimilarity"/>
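To make the "constant norm inside the sweet spot, smaller outside" behavior concrete, here is a small numeric sketch. It follows my reading of SweetSpotSimilarity's length-norm formula; the min/max/steepness values below are made-up tuning parameters for illustration, not values taken from this thread or from Solr's defaults.

```python
import math

def sweet_spot_length_norm(num_terms, ln_min=1, ln_max=100, steepness=0.5):
    """Sketch of a sweet-spot length norm: 1.0 for any document whose
    length falls in [ln_min, ln_max], decreasing the further the length
    strays outside that range."""
    l = max(num_terms, 1)
    return 1.0 / math.sqrt(
        steepness * (abs(l - ln_min) + abs(l - ln_max) - (ln_max - ln_min)) + 1.0
    )
```

Inside the sweet spot the two absolute values sum to exactly (ln_max - ln_min), so the expression under the square root collapses to 1.0 and every such document gets the same norm; outside it, the norm falls off smoothly, which is the punishment described above.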
Debugging Queries
I have a query that is not returning the results I expect - as in, there are missing results. Given a document's ID, is there a way to dive into how the entity is stored in the index? Thanks.
Re: SweetSpotSimilarity
iorixxx wrote: it is in schema.xml: <similarity class="org.apache.lucene.search.SweetSpotSimilarity"/> Thanks. I'm guessing this is all or nothing, i.e. you can't use one similarity class for one request handler and another for a separate request handler. Is that correct? -- View this message in context: http://lucene.472066.n3.nabble.com/SweetSpotSimilarity-tp922546p922622.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SweetSpotSimilarity
Thanks. I'm guessing this is all or nothing, i.e. you can't use one similarity class for one request handler and another for a separate request handler. Is that correct? Correct. Also, a re-index is required: length norms are calculated and stored at index time.
RE: solr indexing takes a long time and is not responsive to abort command
Thanks for the response. I double-checked that we don't have the core open multiple times. The complete index size is about 200M (around 1,060,000 documents). During the indexing process, 26 files were created. The core admin interface indicated that no query or process was running after roughly 5 hours, but the Time Elapsed counter was still going. We have the indexDefaults settings as follows:

<useCompoundFile>false</useCompoundFile>
<mergeFactor>10</mergeFactor>

Do you think lowering mergeFactor to 5 and setting useCompoundFile to true would help? I'll try it out on Monday. Thanks again! -Original Message- From: Don Werve [mailto:d...@madwombat.com] Sent: Thursday, June 24, 2010 9:09 PM To: solr-user@lucene.apache.org Subject: Re: solr indexing takes a long time and is not responsive to abort command 2010/6/25 Ya-Wen Hsu y...@eline.com This situation doesn't happen consistently. When we only ran the problematic core, the indexing took significantly longer than usual (4 hrs - 11 hrs). It ran successfully in the end. When we ran indexing for all cores at the same time, the problematic core never finished indexing, so we had to kill the process. This has happened twice already. I'm running it in parallel again to see if the problem still persists. Off the top of my head: Have you accidentally opened this core multiple times within the same JVM? I had the same thing happen to me when I was testing out a Solr interface I had written under JRuby; that was loads of fun to track down... How physically large is the core ('du -sh' if you're on Unix), and how many files does the index contain? I've run into issues where frequent updates created a lot of index files, which slowed down all core access. If you've got a lot of index files, has the problem core been optimized?
Re: Debugging Queries
Frank: http://www.getopt.org/luke/ Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Frank A fsa...@gmail.com To: solr-user@lucene.apache.org Sent: Fri, June 25, 2010 1:23:37 PM Subject: Debugging Queries I have a query that is not returning the results I expect - as in there are missing results. Is there a way given an ID to the index field to dive into how the entity is stored in the index? Thanks.
Re: XML DataImportHandler copy + resize pictures in localhost?
Marc, Why not use http://www.imagemagick.org/script/index.php to generate thumbnails separately from document indexing? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: scr...@asia.com scr...@asia.com To: solr-user@lucene.apache.org Sent: Fri, June 25, 2010 4:12:02 AM Subject: XML DataImportHandler copy + resize pictures in localhost? Hi, I'm adding documents to Solr via XML files and the DataImportHandler. In the XML file I've got some product picture links:

<picture>
  <picture_url>http://www.example.com/pic.jpg</picture_url>
</picture>

I would like to keep a local thumbnail of these pictures on the local server in order to avoid long external loading times. Example: Original picture: http://www.example.com/pic.jpg is 800x600px == conversion ==> Local picture: http://localhost/pic.jpg in 100x100px. Is there a way to do this? Thanks for your help. Marc
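To sketch the ImageMagick suggestion: a small Python helper that computes the thumbnail geometry and shells out to ImageMagick's convert. This is an illustrative sketch under stated assumptions, not a tested pipeline: it assumes ImageMagick is installed on the server and that the original image has already been downloaded locally; fetching http://www.example.com/pic.jpg and rewriting the picture_url field are left to the indexing side.

```python
import shutil
import subprocess

def thumb_size(width, height, box=100):
    """Scale (width, height) to fit inside a box x box square,
    preserving the aspect ratio."""
    scale = box / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))

def make_thumbnail(src, dest, box=100):
    """Generate dest from src via ImageMagick's `convert -thumbnail`,
    which applies the same fit-inside scaling as thumb_size()."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick's `convert` not found on PATH")
    subprocess.run(
        ["convert", src, "-thumbnail", f"{box}x{box}", dest],
        check=True,
    )
```

Note that for the 800x600 example above, fitting into 100x100 while preserving aspect ratio actually yields a 100x75 thumbnail; a hard 100x100 result would require cropping or distortion.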
Re: dataimport.properties is not updated on delta-import
Please note that Oracle (or the Oracle JDBC driver) converts column names to upper case even though you state them in lower case. If this is the case, then try to rewrite your query in the following form: select id as id, name as name from table On Thursday, June 24, 2010, warb w...@mail.com wrote: Hello again! Upon further investigation it seems that something is amiss with delta-import after all: the delta-import does not actually import anything (I thought it did when I ran it previously, but now I am not sure that was the case.) It does complete successfully as seen from the front-end (dataimport?command=delta-import). Also, the logs state that the import was successful (INFO: Delta Import completed successfully), but there are exceptions pertaining to some documents. The exception message is that the id field is missing (org.apache.solr.common.SolrException: Document [null] missing required field: id). Now, I have checked the column names in the table, the data-config.xml file and the schema.xml file, and they all have the column/field names written in lowercase and are even named exactly the same. Does Solr roll back delta-imports if one or more of the documents fail? -- View this message in context: http://lucene.472066.n3.nabble.com/dataimport-properties-is-not-updated-on-delta-import-tp916753p919609.html Sent from the Solr - User mailing list archive at Nabble.com.
Setting many properties for a multivalued field. Schema.xml ? External file?
Hi, I'm trying to index data containing a multivalued field picture that has three properties: url, caption and description:

<picture>
  <url/>
  <caption/>
  <description/>
</picture>

Thus, each indexed document might have many pictures, each of which has a url, a caption, and a description. I wonder whether it's possible to store this data using only schema.xml. I couldn't figure it out so far. Instead, I'm thinking of using an external file to store the properties of each picture, but I haven't tried this solution yet; waiting for your suggestions... Thanks, -Saïd
indexing xml document with literals
Does anyone know how to read in data from one or more of the example xml docs and ALSO store the filename and path from which it came? I.e., exampledocs/vidcard.xml contains:

<add>
  <doc>
    <field name="id">EN7800GTX/2DHTV/256M</field>
    <field name="name">ASUS Extreme N7800GTX/2DHTV (256 MB)</field>
  </doc>
  <doc>
    <field name="id">100-435805</field>
    <field name="name">ATI Radeon X1900 XTX 512 MB PCIE Video Card</field>
  </doc>
</add>

Two questions: once the data gets indexed by Solr, is there anything we can use to know that this data came from that file? I.e., what was the name and location of the file that holds the data? I need access to the path and filename of the xml file containing the entries when searching. And is there any way to append information to xml data being indexed through the query parameters, like there is with the ExtractingRequestHandler (e.g. literal.id=x&literal.filename=vidcard.xml), or does all this information have to be in the particular doc in question? Thanks so much for any help on this.
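One common workaround for the question above is to preprocess each example file and inject the filename as an extra field into every doc before posting it to Solr. Here is a hedged sketch of that preprocessing step; the field name "filename" is my own choice here, and it would also need a matching field entry in schema.xml.

```python
import xml.etree.ElementTree as ET

def add_filename_field(xml_text, filename):
    """Insert a <field name="filename"> into each <doc> of a Solr add file,
    so the source file's name is searchable after indexing.
    'filename' is a hypothetical field name that must exist in schema.xml."""
    root = ET.fromstring(xml_text)
    for doc in root.iter("doc"):
        f = ET.SubElement(doc, "field", name="filename")
        f.text = filename
    return ET.tostring(root, encoding="unicode")
```

The rewritten XML can then be posted to /update as usual (e.g. with post.jar or curl), and queries can filter on filename:vidcard.xml.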
Re: Setting many properties for a multivalued field. Schema.xml ? External file?
Saïd, Dynamic fields could help here; for example, imagine a doc with: id, pic_url_*, pic_caption_*, pic_description_*. See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields So, for you:

<dynamicField name="pic_url_*" type="string" indexed="true" stored="true"/>
<dynamicField name="pic_caption_*" type="text" indexed="true" stored="true"/>
<dynamicField name="pic_description_*" type="text" indexed="true" stored="true"/>

Then you can add docs with an unlimited number of pic_(url|caption|description)_* fields, e.g.: id, pic_url_1, pic_caption_1, pic_description_1, pic_url_2, pic_caption_2, pic_description_2. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Saïd Radhouani r.steve@gmail.com To: solr-user@lucene.apache.org Sent: Fri, June 25, 2010 6:01:13 PM Subject: Setting many properties for a multivalued field. Schema.xml? External file? Hi, I'm trying to index data containing a multivalued field picture that has three properties: url, caption and description:

<picture>
  <url/>
  <caption/>
  <description/>
</picture>

Thus, each indexed document might have many pictures, each of which has a url, a caption, and a description. I wonder whether it's possible to store this data using only schema.xml. I couldn't figure it out so far. Instead, I'm thinking of using an external file to store the properties of each picture, but I haven't tried this solution yet; waiting for your suggestions... Thanks, -Saïd
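The numbered dynamic fields above can be produced on the client side with a small flattening step. This is a sketch: the pic_*_N names simply follow the dynamicField patterns suggested in this reply, and the plain dict stands in for whatever document object your Solr client library uses.

```python
def build_doc(doc_id, pictures):
    """Flatten a list of picture dicts (url/caption/description) into
    numbered dynamic fields matching the pic_*_N convention above."""
    doc = {"id": doc_id}
    for i, pic in enumerate(pictures, start=1):
        doc[f"pic_url_{i}"] = pic["url"]
        doc[f"pic_caption_{i}"] = pic["caption"]
        doc[f"pic_description_{i}"] = pic["description"]
    return doc
```

The downside of this layout is that querying "any caption" requires either enumerating pic_caption_1, pic_caption_2, ... or copyField-ing the pattern into a single multivalued search field.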