The replication relies on the Lucene API to know which files are
associated with an index version. If it also returns the lock file, that
file is replicated too.
I guess we must ignore the .lock file if it is returned in the list of files.
You can raise an issue and we can fix it.
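A minimal sketch of the filtering idea, written in Python for illustration only. It is not the actual Solr replication code (which is Java); the function name and the sample file list are hypothetical.

```python
def files_to_replicate(index_files):
    """Return the index files to copy, skipping any lock file.

    Assumption: a lock file is recognised purely by its ".lock" suffix.
    """
    return [name for name in index_files if not name.endswith(".lock")]


# Hypothetical file list, as an index-version lookup might return it.
files = ["segments_2", "_0.cfs", "write.lock"]
print(files_to_replicate(files))
```

Filtering by suffix keeps the check independent of where the lock file lives, at the cost of also skipping any other file that happens to end in .lock.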
--Noble
On Fri, May
Hi all,
Sol Lederman has written a review
(http://federatedsearchblog.com/2009/05/14/review-faceted-search/) of
the almost finished manuscript of Daniel Tunkelang's Faceted Search
which is set to be published in June. As the text also mentions Solr it
Out of the box, the simplest way to configure CommonsHttpSolrServer
through a spring application context is to simply define the bean for the
server and inject it into whatever class you have that will use it, like
Avlesh shared below.
bean id=httpSolrServer
On Thu, May 14, 2009 at 8:36 PM, Mark Miller markrmil...@gmail.com wrote:
Michael McCandless wrote:
So why haven't we enabled this by default, already?
Why isn't Lucene done already :)
I hear you :)
Mike
Hello
I have a field defined in schema.xml as an integer which should
contain one of the values 0, 1, 2, 10 or 11, but my result documents are
showing it as either 'true' or 'false'. The majority of the half
million documents have this field as 0 or 1, but around 6,000 have it
as 2, 10 or 11. The
Dear community,
I'm wondering if there is a clean solution to my rather interesting problem.
The following facet query results in a list of all facets and the number of
all documents matching the corresponding facet as seen below:
Query:
<str name="q">*:*</str>
<str name="facet.limit">5</str>
str
In the spirit of good defaults:
I think we should change the Solr highlighter to highlight phrase
queries by default, as well as prefix, range, wildcard constantscore
queries. It's awkward to have to tell people they have to turn those on.
I'd certainly prefer to have to turn them off if I have
On May 14, 2009, at 8:46 PM, Chris Miller wrote:
1) How do I search for ALL items? For example, I provide a sort
query parameter of updated and a rows query parameter of 10 to
limit the query results. I still have to provide a search query, of
course. What if I want to provide a list of
No. This patch does not help in the case where the data is not HTML but is
parsed by HTMLStripReader.
It looks like we just need a fine-tuned try/catch in the code, to catch only
the non-HTML data case.
On Tue, May 12, 2009 at 6:05 PM, Yonik Seeley yo...@lucidimagination.com wrote:
I just committed a minor match
Something that would be interesting is to share solr configs for
various types of indexing tasks. From a solr configuration aimed at
indexing web pages to one doing large amounts of text to one that
indexes specific structured data. I could see those being posted on
the wiki and helping
Hello,
I found only one post about updating documents; maybe things have evolved
since then.
I need to update a field in a few thousand documents at once (or in multiple
requests), but I would rather not have to add a new document in place of the
current one (I mean it's how it works if I well
Hello,
What we have done is create multiple Solr instances on the same server,
where each instance is populated with the DataImportHandler from a different
DB. The information in each DB is similar, so the schemas for each
instance are pretty much the same. Our goal is to use the shards
Hello all,
I've got some weird problem with a simple field search.
The field facility_indexed has the following terms:
- kooklessen (freq: 422)
- workshop (freq: 422)
These terms were tokenized from the string "Kooklessen en Workshops". So
during insertion in Solr, the string was successfully
Hello List,
I have the query below:
art_id:queryText&start=0&rows=10&sort=score desc and this should not yield
any results because art_id contains numbers.
But when I execute this search, it returns more than 100 documents. The
art_id field is a String in schema.xml.
Can anyone tell me how
Hi,
I found out why it is returning irrelevant documents. I am encoding my query
string with UTF-8 and appending it to the URL as follows, so it fails.
This is the query string = art_id:queryText&start=0&rows=10&sort=score desc
encoded url :
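One plausible reading of the failure described in this message, sketched in Python with only the standard library: if the whole string is percent-encoded in one go, the `&` and `=` separators are escaped as well, so `start`, `rows` and `sort` end up inside the `q` value instead of being separate parameters. The helper names are hypothetical; the parameter names and values are taken from the message.

```python
from urllib.parse import quote, urlencode


def encode_whole(raw_query_string):
    """Percent-encode the entire string, separators included (the failing approach)."""
    return quote(raw_query_string, safe="")


def build_query_string(params):
    """Encode each parameter value on its own and join the pairs with '&'."""
    return urlencode(params)


raw = "art_id:queryText&start=0&rows=10&sort=score desc"
print(encode_whole(raw))  # '&' and '=' come out as %26 and %3D

params = {"q": "art_id:queryText", "start": 0, "rows": 10, "sort": "score desc"}
print(build_query_string(params))  # separators stay literal
```

Encoding the values individually keeps the separators intact, so the server can still tell the parameters apart.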
https://issues.apache.org/jira/browse/SOLR-1170
-Bryan
On May 15, 2009, at May 15, 12:24 AM, Noble Paul നോബിള്
नोब्ळ् wrote:
The replication relies on the Lucene API to know which files are
associated with an index version. If it also returns the lock file, that
file is replicated too.
I
Bryan Talbot schrieb:
So how are people managing solrconfig.xml files which are largely the
same other than differences for replication?
I don't think it's a good thing to maintain two copies of the same
file and I'd like to avoid that. Maybe enabling the XInclude feature
in DocumentBuilders
I agree regarding posting different types of files - because right now
if you're just starting out with Solr, taking the sample files from
the distro and going from there is the /only path/ =\
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833
Hi All,
I've got a small problem with replication.
Let's say I post a document on the master server, and the slaves run
a snappuller/installer via crontab every minute.
Then, for 30 seconds on average, my search servers are not all
synchronized.
Is there a way to improve
Certainly does seem strange.
Do you have the same uniqueKeyField in both indexes?
Any way you can provide some configuration and some data to reproduce this?
-Yonik
On Fri, May 15, 2009 at 10:40 AM, CB-PO charles.bush...@gmail.com wrote:
Hello,
What we have done is created multiple solr
You'd have to have extra hardware. You'd take some number of servers out
of service while they are being updated. Then you'd put them back in service
and take the other half out, update, and put them back in.
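The procedure above can be sketched roughly as follows, in Python. This is only an illustration of the ordering; the function, the server names, and the `update` callback are all hypothetical, and real load-balancer handling is left out.

```python
def rolling_update(servers, update):
    """Update servers in two batches so that one batch always stays in service.

    Returns a log of (batch being updated, servers still serving) pairs.
    Assumes at least two servers; with one server nothing would be left serving.
    """
    half = len(servers) // 2
    batches = [servers[:half], servers[half:]]
    in_service = set(servers)
    log = []
    for batch in batches:
        in_service -= set(batch)      # take this batch out of service
        for server in batch:
            update(server)            # e.g. pull and install the new index
        log.append((list(batch), sorted(in_service)))
        in_service |= set(batch)      # put the batch back in service
    return log


updated = []
print(rolling_update(["s1", "s2", "s3", "s4"], updated.append))
```

While a batch is out, the remaining servers carry the full query load, which is why the extra hardware is needed.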
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
-
Hi All,
I am trying to index the fields from the XML files; here is the
configuration that I am using.
db-data-config.xml
<dataConfig>
<dataSource type="FileDataSource" name="xmlindex"/>
<document name="products">
<entity name="xmlfile" processor="FileListEntityProcessor"
Yeah, the first thing I thought of was that perhaps there was something wrong
with the uniqueKey and they were clashing between the indexes, however upon
visual inspection of the data the field we are using as the unique key in
each of the indexes is grossly different between the two databases,
If that is your complete input file then it looks like you are missing the
wrapping <add></add> element:
<add>
<doc>
<field name="id">F8V7067-APL-KIT</field>
<field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
<field name="manu">Belkin</field>
<field name="cat">electronics</field>
<field
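A small sketch of producing such an input file programmatically, so that the wrapping add element cannot be forgotten. It uses Python's standard xml.etree.ElementTree; the function name is hypothetical and the field values are the ones quoted in this message.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Wrap each document's fields in <doc> and all documents in a single <add>."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


docs = [{
    "id": "F8V7067-APL-KIT",
    "name": "Belkin Mobile Power Cord for iPod w/ Dock",
    "manu": "Belkin",
    "cat": "electronics",
}]
print(build_add_xml(docs))
```

Building the tree with a library also takes care of escaping characters such as & or < in field values.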
Many thanks for the reply.
The complete input XML file is below; I missed including this earlier.
<add>
<doc>
<field name="id">F8V7067-APL-KIT</field>
<field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
<field name="manu">Belkin</field>
<field name="cat">electronics</field>
<field
On Fri, May 15, 2009 at 4:11 PM, CB-PO charles.bush...@gmail.com wrote:
Yeah, the first thing I thought of was that perhaps there was something wrong
with the uniqueKey and they were clashing between the indexes, however upon
visual inspection of the data the field we are using as the unique
Hi,
I'm experimenting with highlighting and am noticing a big drop in
performance with my setup. I have documents that use quite a few dynamic
fields (20-30). The fields are multiValued stored/indexed text fields, each
with a few paragraphs worth of text. My hl.fl param is set to *_t
What kinds
Is there a built-in mechanism for grouping similar documents together in the
response? I'd like to make it look like there is only one document with
multiple hits.
Matt
Collapse component may be of interest to you
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
On Fri, May 15, 2009 at 3:52 PM, Matt Mitchell goodie...@gmail.com wrote:
Is there a built-in mechanism for grouping similar documents
Hello,
I'm having a problem with a query but I don't understand what is wrong
with it. Can someone explain the following?
Here are a few queries that work as expected (premium is a boolean
field):
premium:false -> 3004
premium:true -> 0
-premium:false -> 0
Matt - you may also want to detect near duplicates at index time:
http://wiki.apache.org/solr/Deduplication
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Matt Mitchell goodie...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday,
Matt,
I believe indexing those fields that you will use for highlighting with term
vectors enabled will make things faster (and your index a bit bigger).
Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Matt Mitchell goodie...@gmail.com
Hi Jeffrey,
And now try: ?q=facility_indexed:"kooklessen en workshops"~1
If that works, head over to the Solr Admin Analysis page, enter the field name,
and that phrase for both index and query analyzer. And then look at term
positions for your two main terms/tokens.
Otis --
Sematext --
Vincent,
Unfortunately things haven't changed yet. If all your fields are stored, have
a look at SOLR-139.
Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Vincent Pérès vincent.pe...@gmail.com
To: solr-user@lucene.apache.org
Sent:
Sachin,
EmbeddedSolrServer implies an embedded, local, in-process access to Solr.
CommonsHttpSolrServer lets you access a remote Solr instance via HTTP.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: sachin78 tendulkarsachi...@gmail.com
Jack,
Which bug are you referring to? Last time I played with function queries with
date fields things worked as expected. If there is/was a known bug, it must be
in JIRA...
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Jack Godwin
Some more info,
Profiling the heap dump shows
org.apache.lucene.index.ReadOnlySegmentReader as the biggest object
- taking up almost 80% of total memory (6G) - see the attached
screenshot for a smaller dump. There is some norms object - not sure where
they are coming from as I've
37 matches