Hi-
I'd like to make a multivalued field of comma-separated phrases. Is there a
class available that I can use for this?
I can see how to create N separate elements for the same field in the update
XML, but is there something I can use in type definition?
Thanks,
Lance
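A minimal sketch of one way to express this in the type definition, assuming each comma-separated phrase should become a single token (type name illustrative; requires PatternTokenizerFactory in your Solr version):

    <fieldType name="commaDelimited" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.PatternTokenizerFactory" pattern=",\s*"/>
      </analyzer>
    </fieldType>

With this, a value like "red wine, white wine, sparkling wine" indexes as the three tokens "red wine", "white wine" and "sparkling wine".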
is, it is exactly the same as:
+a:valueAlpha +a:valueBeta +a:valueGamma
I have to use OR between the values.
Is this supposed to be true?
Thanks,
Lance
-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 01, 2007 12:48 AM
To: solr-user@lucene.apache.org
text -(collection:pile1 OR collection:pile2)
When we apply De Morgan's Law, we get 0 records:
text (-collection:pile1 AND -collection:pile2)
This should return all records, but it returns nothing:
text (-collection:pile1 OR -collection:pile2)
Thanks,
Lance
A simplified version of the problem:
text -(collection:pile1)
works, while
text (-collection:pile1)
finds zero records.
lance
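This is the standard workaround for a known Lucene behavior: a parenthesized clause containing only negative terms is a nested BooleanQuery with nothing to subtract from, so it matches no documents. Giving it the match-all term *:* fixes both failing queries above:

    text (*:* -collection:pile1)
    text (*:* -collection:pile1 -collection:pile2)

The form "text -(collection:pile1)" works only because top-level negation is handled specially by the query parser.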
_
From: Lance Lance [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 12, 2007 5:58 PM
To: 'solr-user@lucene.apache.org'
Subject: Question on query syntax
Ok, here's a simpler version:
_
From: Lance Lance [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 12, 2007 5:58 PM
To: 'solr-user@lucene.apache.org'
Subject: Question on query syntax
Are there any known bugs in the syntax parser? We're using lucene-2.2.0 and
Hi,
Could you please help me with a quick question - is there a way to restrict
lucene/solr fuzzy search to only analyze words that have more than 5 characters
and to ignore shorter words (i.e. words of fewer than 6 characters)?
Thanks
-
Lance
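A client-side sketch (hypothetical helper, not a Solr feature): build the query string yourself and append the fuzzy operator only to terms longer than 5 characters:

    public class FuzzyLongTerms {
        // Append '~' only to terms of 6+ characters; shorter terms match exactly.
        static String fuzzify(String userQuery) {
            StringBuilder sb = new StringBuilder();
            for (String term : userQuery.trim().split("\\s+")) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(term);
                if (term.length() > 5) sb.append('~');
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            // "the" and "quick" stay exact; "analysis" becomes fuzzy.
            System.out.println(fuzzify("the quick analysis")); // the quick analysis~
        }
    }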
problem solving and analytical
abilities. You must have a solid grasp of English – written and verbal.
Please note that I am a start-up and I am not going to be able to pay what a
large established company can pay.
Thank you,
Lance
-----
Lance
Thanks Markus,
I did look at that list, but I'm wondering if there is anyone who is not on the
list who may be interested.
-
Lance
On May 17, 2011, at 4:09 PM, Markus Jelsma wrote:
Check this out:
http://wiki.apache.org/solr/Support
> Hi,
>
>
Thanks Shashi - there aren't too many qualified people on those sites - I have
looked.
-
Lance
On May 17, 2011, at 4:13 PM, Shashi Kant wrote:
You might be better off looking for freelancers on sites such as
odesk.com, guru.com, rentacoder.com, elance.com
On 10/13/2013 10:02 AM, Shawn Heisey wrote:
On 10/13/2013 10:16 AM, Josh Lincoln wrote:
I have a large solr response in xml format and would like to import it into
a new solr collection. I'm able to use DIH with solrEntityProcessor, but
only if I first truncate the file to a small subset of the
Can you get this data in CSV format? There is a CSV reader in the DIH.
The SEP was not intended to read from files, since there are already
better tools that do that.
Lance
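If you can export CSV, Solr's stock CSV loader (separate from DIH) takes it directly over HTTP; the handler path below is from the example configs, and newer releases also accept CSV at /update with this content type:

    curl 'http://localhost:8983/solr/update/csv?commit=true' \
         -H 'Content-type: text/csv; charset=utf-8' --data-binary @export.csv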
On 10/14/2013 04:44 PM, Josh Lincoln wrote:
Shawn, I'm able to read in a 4mb file using SEP, so I think that rule
> And field declared for this analyzer:
>
> omitNorms="true" omitPositions="true"/>
>
>
>
> Problem is here : When I search over this field Detail_Person, results are
> not constant.
>
>
>
> When I search Detail_Person:brett, it returns one document
>
>
>
>
>
> But again when I fire the same query, it returns zero documents.
>
>
>
> Searching is not stable on the OpenNLP field; sometimes it returns documents
> and sometimes not, but the documents are there.
>
> And if I search on non-OpenNLP fields, it works properly; results are
> stable and correct.
>
> Please help me to make solr results consistent.
>
> Thanks in Advance.
>
>
--
Lance Norskog
goks...@gmail.com
on garbage collection.
Lance
On 11/22/2013 05:27 AM, Martin de Vries wrote:
We did some more monitoring and have some new information:
Before
the issue happens the garbage collector's "collection count" increases a
lot. The increase seems to start about an hour before the r
and add the payloads. (but I am not able to analyze it)
>
> My Question is:
> Can I search a phrase giving a high boost to NOUN then VERB?
> For example: if I am searching "sitting on blanket", I want to give a high
> boost to NOUN terms first, then VERB, as tagged by OpenNLP.
> How can i use payloads for boosting?
> What are the changes required in schema.xml?
>
> Please provide me some pointers to move ahead
>
> Thanks in advance
>
--
Lance Norskog
goks...@gmail.com
I do not know what causes the error, but this setup will not work: you need
one or three ZooKeepers. SolrCloud demands that a majority of the ZK
servers agree, and with two ZKs a single failure breaks that majority.
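The arithmetic: a quorum is floor(N/2) + 1 servers.

    1 ZK:  quorum 1, tolerates 0 failures
    2 ZKs: quorum 2, tolerates 0 failures (two machines, still no failover)
    3 ZKs: quorum 2, tolerates 1 failure

So two ZKs are worse than one: the same failure tolerance, with twice the hardware that can break the quorum.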
On 06/29/2013 05:47 AM, Sagar Chaturvedi wrote:
Hi,
I set up 2 solr instances on 2 different machines
s, you can keep some queries cached longer than
your timeout.
Lance
On 06/29/2013 05:51 PM, William Bell wrote:
On a large website, by putting 1 varnish in front of all 4 SOLR boxes we
were able to trim 25% off the load time (TTFB) of the page.
Our hit ratio was between 55 and 75%. We gave varni
This usually means the end server timed out.
On 06/30/2013 06:31 AM, Shahar Davidson wrote:
Hi all,
We're getting the below exception sporadically when using distributed search.
(using Solr 4.2.1)
Note that 'core_3' is one of the cores mentioned in the 'shards' parameter.
Any ideas anyone?
T
The MappingCharFilter allows you to map both characters to one
character. If you do this during indexing and querying, searching with
one should find the other. This is sort of like synonyms, but on a
character-by-character basis.
Lance
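A sketch of the wiring, assuming a mapping file named mapping-chars.txt in the conf directory (type name illustrative); the single analyzer below applies at both index and query time:

    <fieldType name="text_mapped" class="solr.TextField">
      <analyzer>
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-chars.txt"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

with mapping-chars.txt containing lines like:

    "œ" => "oe"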
On 06/18/2013 11:08 PM, Yash Sharma wrote:
> Hi,
>
Also, total index file size. At 200-300gb managing an index becomes a pain.
Lance
On 07/08/2013 07:28 AM, Jack Krupansky wrote:
Other that the per-node/per-collection limit of 2 billion documents
per Lucene index, most of the limits of Solr are performance-based
limits - Solr can handle it
Norms stay in the index even if you delete all of the data. If you just
changed the schema, emptied the index, and tested again, you've still
got norms in there.
You can examine the index with Luke to verify this.
On 07/09/2013 08:57 PM, William Bell wrote:
I have a field that has omitNorms=t
I don't know about jvm crashes, but it is known that the Java 6 jvm had
various problems supporting Solr, including the 20-30 series. A lot of
people use the final jvm release (I think 6_30).
On 07/16/2013 12:25 PM, neoman wrote:
Hello Everyone,
We are using solrcloud with Tomcat in our produc
Are you feeding Graphite from Solr? If so, how?
On 07/19/2013 01:02 AM, Neil Prosser wrote:
That was overnight so I was unable to track exactly what happened (I'm
going off our Graphite graphs here).
Solr/Lucene does not automatically add when asked, the way DBMS systems
do. Instead, all data for a field is added at the same time. To get the
new field, you have to reload all of your data.
This is also true for deleting fields. If you remove a field, that data
does not go away until you re-index.
Cool!
On 08/05/2013 03:34 AM, Charlie Hull wrote:
On 03/08/2013 00:50, Mark wrote:
We have a set number of known terms we want to match against.
In Index:
"term one"
"term two"
"term three"
I know how to match all terms of a user query against the index but
we would like to know how/if we ca
scalable implementation of n-gram based document
similarity. It calculates distances between all documents and identifies
clusters of similar documents. This is a much more general technique and
may help you find "obfuscated" plagiarism.
Lance
On 07/23/2013 02:33 AM, Furkan KAMACI
viewer. This will give you #1 and #3.
Lance
On 08/21/2013 09:00 AM, jiunarayan wrote:
I have a svn respository and svn file path. How can I SOLR search content on
the svn file.
Solr does not by default generate unique IDs. It uses what you give as
your unique field, usually called 'id'.
What software do you use to index data from your RSS feeds? Maybe that
is creating a new 'id' field?
There is no partial update: Solr (Lucene) always rewrites the complete
document.
data is a pain in the neck to administer.
As always, every index is different, but you should not have problems
doing the merge that you describe.
Lance
On 09/08/2013 09:01 PM, diyun2008 wrote:
Thank you Erick. It's very useful to me. I have already started to merge logs
of collecti
Seconded. Single-stepping really is the best way to follow the logic
chains and see how the data mutates.
On 04/05/2013 06:36 AM, Erick Erickson wrote:
Then there's my lazy method. Fire up the IDE and find a test case that
looks close to something you want to understand further. Step through
it
Outer distance AND NOT inner distance?
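One way to spell that with the Solr 3.1+ spatial filters (field name and point illustrative): keep the outer circle with one geofilt filter and subtract the inner circle with a negated nested query:

    fq={!geofilt sfield=store pt=45.15,-93.85 d=50}
    fq=-_query_:"{!geofilt sfield=store pt=45.15,-93.85 d=10}"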
On 04/12/2013 09:02 AM, kfdroid wrote:
We currently do a radius search from a given Lat/Long point and it works
great. I have a new requirement to do a search on a larger radius from the
same point, but not include the smaller radius. Kind of a donut (toru
Run checksums on all files in both master and slave, and verify that
they are the same.
TCP/IP has a checksum algorithm that was state-of-the-art in 1969.
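For example (paths illustrative, Unix):

    cd /var/solr/data/index
    find . -type f | sort | xargs md5sum > /tmp/index-$(hostname).md5
    # copy both .md5 files to one box, then:
    diff index-master.md5 index-slave.md5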
On 04/18/2013 02:10 AM, Victor Ruiz wrote:
Also, I forgot to say... the same error started to happen again.. the index
is again corrupted :(
Great! Thank you very much Shawn.
On 05/04/2013 10:55 AM, Shawn Heisey wrote:
On 5/4/2013 11:45 AM, Shawn Heisey wrote:
Advance warning: this is a long reply.
I have condensed some relevant performance problem information into the
following wiki page:
http://wiki.apache.org/solr/SolrPerforman
If this is for the US, remove the age range feature before you get sued.
On 05/09/2013 08:41 PM, Kamal Palei wrote:
Dear SOLR experts
I might be asking a very silly question. As I am new to SOLR kindly guide
me.
I have a job site. Using SOLR to search resumes. When a HR user enters some
keywor
This is great; data like this is rare. Can you tell us any hardware or
throughput numbers?
On 05/17/2013 12:29 PM, Rishi Easwaran wrote:
Hi All,
Its Friday 3:00pm, warm & sunny outside and it was a good week. Figured I'd
share some good news.
I work for AOL mail team and we use SOLR for our
If the indexed data includes positions, it should be possible to
implement ^ and $ as the first and last positions.
On 05/22/2013 04:08 AM, Oussama Jilal wrote:
There is no ^ or $ in the solr regex since the regular expression will
match tokens (not the complete indexed text). So the results yo
I will look at these problems. Thanks for trying it out!
Lance Norskog
On 05/28/2013 10:08 PM, Patrick Mi wrote:
Hi there,
Checked out branch_4x and applied the latest patch
LUCENE-2899-current.patch however I ran into 2 problems
Followed the wiki page instruction and set up a field with
Let's assume that the Solr record includes the database record's
timestamp field. You can make a more complex DIH stack that does a Solr
query with the SolrEntityProcessor. You can do a query that gets the
most recent timestamp in the index, and then use that in the DB update
command.
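One way to wire this up without SolrEntityProcessor, using DIH request parameters (field and parameter names illustrative):

    # 1. Ask the index for the newest timestamp it holds:
    curl 'http://localhost:8983/solr/select?q=*:*&rows=1&sort=last_modified+desc&fl=last_modified'

    # 2. Hand that value to DIH; data-config.xml can read it as
    #    ${dataimporter.request.last_ts}, e.g. in the entity's SQL:
    #    query="SELECT * FROM docs WHERE last_modified > '${dataimporter.request.last_ts}'"
    curl 'http://localhost:8983/solr/dataimport?command=full-import&clean=false&last_ts=2013-06-01T00:00:00Z'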
On 06/02
Distributed search does the actual search twice: once to get the scores
and again to fetch the documents with the top N scores. This algorithm
does not play well with "deep searches".
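For example, start=100000&rows=10 makes every shard return its top 100,010 (id, score) pairs in the first phase; the coordinator merges them, picks the winning 10, and only then fetches those documents.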
On 06/02/2013 07:32 PM, Niran Fajemisin wrote:
Thanks Daniel.
That's exactly what I thought as well. I did tr
, the example on the wiki is wrong. The FilterPayloadsFilter
default is to remove the given payloads, and needs keepPayloads="true"
to retain them.
The fixed patch is up as LUCENE-2899-x.patch. Again, thanks for trying it.
Lance
https://issues.apache.org/jira/browse/LUCENE-2899
On 05/2
text_opennlp has the right behavior.
text_opennlp_pos does what you describe.
I'll look some more.
On 06/09/2013 04:38 PM, Patrick Mi wrote:
Hi Lance,
I updated the src from 4.x and applied the latest patch LUCENE-2899-x.patch
uploaded on 6th June but still had the same problem.
Re
Found the problem. Please see:
https://issues.apache.org/jira/browse/LUCENE-2899?focusedCommentId=13679293&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13679293
On 06/09/2013 04:38 PM, Patrick Mi wrote:
Hi Lance,
I updated the src from 4.x and applied the la
In 4.x and trunk is a close() method on Tokenizers and Filters. In
currently released up to 4.3, there is instead a reset(stream) method
which is how it resets a Tokenizer&Filter for a following document in
the same upload.
In both cases I had to track the first time the tokens are consumed, a
No, they just learned a few features and then stopped because it was
"good enough", and they had a thousand other things to code.
As to REST- yes, it is worth having a coherent API. Solr is behind the
curve here. Look at the HATEOAS paradigm. It's ornate (and a really goofy
name) but it provide
One small thing: German u-umlaut is often "flattened" as 'ue' instead of
'u'. And the same with o-umlaut, it can be 'oe' or 'o'. I don't know if
Lucene has a good solution for this problem.
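The MappingCharFilter mentioned up-thread can at least express the flattening; you have to pick one direction and apply the same mapping at index and query time. A sketch of the mapping file:

    "ä" => "ae"
    "ö" => "oe"
    "ü" => "ue"
    "ß" => "ss"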
On 06/16/2013 06:44 AM, adityab wrote:
Thanks for the explanation Steve. I now see it clearly. In my cas
Accumulo is a BigTable/Cassandra style distributed database. It is now
an Apache Incubator project. In the README we find this gem:
"Synchronize your accumulo conf directory across the cluster. As a
precaution against mis-configured systems, servers using different
configuration files will not
>
--
Lance Norskog
goks...@gmail.com
with the help of solr.
>>>
>>> There are following points about catgory and products to be considered,
>>> 1.One product can belong to more than one categories.
>>> 2.category is a hierarchical facet.
>>> 3.More than one categories can share same name.
>>>
>>> It would be a great help if someone can suggest a way to index and query
>>> data based on the above architecture.
>>>
>>> Thanks,
>>> Priti
>>>
>>>
>
--
Lance Norskog
goks...@gmail.com
org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(Unknown
> >>> > Source)
> >>> > at
> org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(Unknown
> >>> > Source)
> >>> > at org.apache.lucene.store.MMapDirectory.openInput(Unknown Source)
> >>> > at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(Unknown
> >>> Source)
> >>> >
> >>> > at org.apache.lucene.index.SegmentReader.get(Unknown Source)
> >>> > at org.apache.lucene.index.SegmentReader.get(Unknown Source)
> >>> > at org.apache.lucene.index.DirectoryReader.<init>(Unknown Source)
> >>> > at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(Unknown
> >>> Source)
> >>> > at org.apache.lucene.index.DirectoryReader$1.doBody(Unknown Source)
> >>> > at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(Unknown
> >>> > Source)
> >>> > at org.apache.lucene.index.DirectoryReader.open(Unknown Source)
> >>> > at org.apache.lucene.index.IndexReader.open(Unknown Source)
> >>> > ...
> >>> > Caused by: java.lang.OutOfMemoryError: Map failed
> >>> > at sun.nio.ch.FileChannelImpl.map0(Native Method)
> >>> > ...
> >>>
> >>>
> >>
> >
>
--
Lance Norskog
goks...@gmail.com
I remember now: when you memory-map one block of address space that big, the
garbage collector has problems working around it. If the OOM is repeatable,
you could try watching the app with jconsole and watching the memory spaces.
Lance
On Thu, Sep 8, 2011 at 8:58 PM, Lance Norskog wrote:
> Do
http://aws.amazon.com/datasets
DBPedia might be the easiest to work with:
http://aws.amazon.com/datasets/2319
Amazon has a lot of these things.
Infochimps.com is a marketplace for free & pay versions.
Lance
On Thu, Sep 15, 2011 at 6:55 PM, Pulkit Singhal wrote:
> Ah missing } doh!
>
> SEVERE: java.lang.ClassCastException:
> org.apache.solr.analysis.SmartChineseWordTokenFilterFactory cannot be cast
> to org.apache.solr.analysis.TokenizerFactory
>
>
> Any thought?
--
Lance Norskog
goks...@gmail.com
server, we INCREASED speed by
> REDUCING the number of cores/threads each query was allowed to use (making
> sense of our customer investment)
> maybe you can get a similar effect by reducing the number of pieces your
> distributed search has to merge
>
> my 2 eurocents
>
> federico
>
--
Lance Norskog
goks...@gmail.com
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:372)
>at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:440)
>at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:421)
>
--
Lance Norskog
goks...@gmail.com
> indeed, be from the query that filled
>>>>>> in the HTTP cache. But what are you doing
>>>>>> with that information that you want to "correct"
>>>>>> it?
>>>>>>
>>>>>> That said, I have no clue how you'd attempt to
>>>>>> do this.
>>>>>>
>>>>>> Best
>>>>>> Erick
>>>>>>
>>>>>> On Sat, Oct 1, 2011 at 5:55 PM, Lord Khan Han wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Is there anyway to get correct Qtime when we use http caching? I
>>>>>>> think Solr caches the Qtime too, giving the same Qtime in the
>>>>>>> response whatever it takes to finish. How can I set Qtime
>>>>>>> correctly from solr when http caching is on.
>>>>>>>
>>>>>>> thanks
>>>>>>>
>>
--
Lance Norskog
goks...@gmail.com
It is generally easiest to use the solr/example 'java -jar start.jar' setup
to test out features. It is easy to break configuration linkages.
Lance
On Thu, Oct 13, 2011 at 12:42 PM, Jeremy Cunningham <
jeremy.cunningham.h...@statefarm.com> wrote:
> I am new to solr and not a web deve
ases are covered..."
>
> ...i thought there was a DIH FAQ about this, but if not there really
> should be.
>
>
> -Hoss
>
--
Lance Norskog
goks...@gmail.com
Yes, please open a JIRA for this, with as much info as possible.
Lance
On Thu, Nov 3, 2011 at 9:48 AM, P Williams
wrote:
> Hi All,
>
> I'm experiencing a similar problem to the others in the thread.
>
> I've recently upgraded from apache-solr-4.0-2011-06-14_08-
> >
> > > > > I am a solr newbie. I find solr documents easy to access and use,
> > > which
> > > > is
> > > > > really good thing. While my problem is I did not find a solr home
> > > grown
> > > > > profiling/monitoring tool.
> > > > >
> > > > > I set up the server as a multi-core server, each core has
> > > approximately
> > > > 2GB
> > > > > index. And I need to update solr and re-generate index in a real
> time
> > > > > manner (In java code, using SolrJ). Sometimes the update operation
> is
> > > > slow.
> > > > > And it is expected that in a year, the index size may increase to
> > 4GB.
> > > > And
> > > > > I need to do something to prevent performance downgrade.
> > > > >
> > > > > Is there any solr official monitoring & profiling tool for this?
> > > > >
> > > > > Spark
> > > >
> > > >
> >
>
--
Lance Norskog
goks...@gmail.com
>
--
Lance Norskog
goks...@gmail.com
Thanks for sharing. I'm not sure it
> does exactly what I want though. I think it is more for checking if the two
> docs are the same, which for my purposes, the url works fine for.
>
> I think I've sort of come to realise that generating a uuid from the url
> might be the way to go. There is a chance of getting the same uuid from
> different urls, but it's only 1 in 2^128, so it's basically non-existent.
>
> Thanks again,
> Blaise
--
Lance Norskog
goks...@gmail.com
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/UniqueKey
On Wed, Dec 7, 2011 at 5:04 PM, Lance Norskog wrote:
> Yes, the SignatureUpdateProcessor is what you want. The 128-bit hash is
> exactly what you want to use in this situation. You will never get the
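For reference, the dedupe chain is configured in solrconfig.xml along these lines (trimmed from the Solr wiki example; field names illustrative, and MD5Signature is the 128-bit implementation):

    <updateRequestProcessorChain name="dedupe">
      <processor class="solr.processor.SignatureUpdateProcessorFactory">
        <bool name="enabled">true</bool>
        <str name="signatureField">id</str>
        <bool name="overwriteDupes">true</bool>
        <str name="fields">url</str>
        <str name="signatureClass">solr.processor.MD5Signature</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>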
sections of matching or nearly matching text
> in documents. Does anyone have any experience in this area that they would be
> willing to share?
> Thanks,
> Mike
--
Lance Norskog
goks...@gmail.com
>> >> > Tried google but I couldn't find a solution there although many people
>> >> > encountered such a problem.
>> >> >
>> >> >
>> >> it definitely can be done by overriding
>> >> o.a.s.update.DirectUpdateHandler2.addDoc(AddUpdateCommand), but I
>> suggest
>> >> to start from implementing your own
>> >> http://wiki.apache.org/solr/UpdateRequestProcessor - search for PK,
>> bypass
>> >> chain call if it's found. Then if you meet performance issues on
>> querying
>> >> your PKs one by one, (but only after that) you can batch your searches,
>> >> there are couple of optimization techniques for huge disjunction queries
>> >> like PK:(2 OR 4 OR 5 OR 6).
>> >>
>> >>
>> >> > I start considering that I must query index to check if a doc to be
>> added
>> >> > is in the index already and do not add it to array but I have so many
>> >> docs
>> >> > that I am afraid it's not a good solution.
>> >> >
>> >> > Best Regards
>> >> > Alexander Aristov
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Sincerely yours
>> >> Mikhail Khludnev
>> >> Lucid Certified
>> >> Apache Lucene/Solr Developer
>> >> Grid Dynamics
>> >>
>>
--
Lance Norskog
goks...@gmail.com
>
> Regards,
>
>
> Vibhor
>
> --
--
Lance Norskog
goks...@gmail.com
, introduce better
memory management and a lot more. For your production upgrade you
should translate your local changes into a fresh 3.5 instance.
Lance
On Wed, Dec 28, 2011 at 5:23 AM, Bhavnik Gajjar wrote:
> Thanks community! That helps!
>
> To check practically, I have now setup So
Solr seems to handle 100g-200g fine on modern hardware.
Lance
On Fri, Dec 23, 2011 at 1:54 AM, Nick Vincent wrote:
> For data of this size you may want to look at something like Apache
> Cassandra, which is made specifically to handle data at this kind of
> scale across many machines.
>
>
--
Lance Norskog
goks...@gmail.com
LUCENE_23
>
>>
>> In Lucene I use an untweaked org.apache.lucene.analysis.de.GermanAnalyzer.
>>
>> What is an equivalent fieldType definition in Solr 3.5?
>
>
>
>
>
> --
> lucidimagination.com
--
Lance Norskog
goks...@gmail.com
which service to go
>> with for solr Cloud Indexing ?
>>
>> Any good and tried services?
>>
>> Regards
>> Sujatha
--
Lance Norskog
goks...@gmail.com
>>> > at org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:179)
>>> > at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
>>> > at org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
>>> > at org.apache.solr.spelling.suggest.Suggester.reload(Suggester.java:153)
>>> > at
>>> >
>>> org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener.newSearcher(SpellCheckComponent.java:675)
>>> > at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1181)
>>> > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>> > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>> > at
>>> >
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> > at
>>> >
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> > at java.lang.Thread.run(Thread.java:662)
>>> >
>>> > Jan 16, 2012 4:06:15 PM org.apache.solr.core.SolrCore registerSearcher
>>> > INFO: [places] Registered new searcher Searcher@34b0ede5 main
>>> >
>>> >
>>> >
>>> > Basically this means once I've run a full-import, I cannot exit the SOLR
>>> > process because I receive this error no matter what when I restart the
>>> > process. I've tried with different -Xmx arguments, and I'm really at a
>>> loss
>>> > at this point. Is there any guideline to how much RAM I need? I've got
>>> 8GB
>>> > on this machine, although that could be increased if necessary. However,
>>> I
>>> > can't understand why it would need so much memory. Could I have something
>>> > configured incorrectly? I've been over the configs several times, trying
>>> to
>>> > get them down to the bare minimum.
>>> >
>>> > Thanks for any assistance!
>>> >
>>> > Dave
>>>
>>>
>>>
>>> --
>>> lucidimagination.com
>>>
>
>
>
> --
> lucidimagination.com
--
Lance Norskog
goks...@gmail.com
>
--
Lance Norskog
goks...@gmail.com
>>> Is there a token filter which does the same job as
>>> MappingCharFilterFactory but after tokenizer, reading the
>>> same config file?
>>
>> No, closest thing can be PatternReplaceFilterFactory.
>>
>> http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternReplaceFilterFactory.html
>>
>>
>>
>
>
--
Lance Norskog
goks...@gmail.com
index. With two indexes
from two sources, the terms in the documents will not have the same
"fingerprint". Relevance scores from one shard will not match the
meaning of a document's score in the other shard.
There is a project to make this work in Solr, but it is not nearly finished.
before starting a new
> development, we want to be sure that we are not doing anything wrong
> in the solr configuration or in the index generation.
>
> Any help would be appreciated.
> Regards,
> Matteo
>
--
Lance Norskog
goks...@gmail.com
and still allow me to use all
> the rest of the features of solr.
>
>
>
--
Lance Norskog
goks...@gmail.com
t to abort the process doesn’t really work. Does
> anyone know what’s happening here? Thanks!
>
> Wen
>
--
Lance Norskog
goks...@gmail.com
id doesn't do "query parser
> escaping" ... mainly because it has no way of knowing which query parser
> you are using.
>
>
> -Hoss
>
>
--
Lance Norskog
goks...@gmail.com
https://issues.apache.org/jira/browse/LUCENE-1812
On Fri, Jun 18, 2010 at 7:26 PM, Otis Gospodnetic
wrote:
> Lance, which project in Solr are you referring to?
>
>
> Thanks,
>
> Otis
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem
Ah! You need a SolrJ program that uses Tika to parse the files and
upload the text. I think there is such a program already but do not
know where it is.
Lance
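A minimal sketch of such a program, assuming SolrJ 4.x (HttpSolrServer) and Tika's AutoDetectParser are on the classpath; field names illustrative:

    import java.io.FileInputStream;
    import java.io.InputStream;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.sax.BodyContentHandler;

    public class TikaIndexer {
        public static void main(String[] args) throws Exception {
            HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
            AutoDetectParser parser = new AutoDetectParser();
            BodyContentHandler text = new BodyContentHandler(-1); // -1 = no write limit
            Metadata metadata = new Metadata();
            InputStream in = new FileInputStream(args[0]);
            try {
                parser.parse(in, text, metadata); // detects type, extracts body text
            } finally {
                in.close();
            }
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", args[0]);
            doc.addField("text", text.toString());
            solr.add(doc);
            solr.commit();
        }
    }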
On Thu, Jun 17, 2010 at 6:13 AM, seesiddharth wrote:
>
> Thank you so much for the reply...The link suggested by you is helpf
Solr depends on Lucene's implementation of queries and how it returns
document hits. I can't help you architect these changes.
On Mon, Jun 21, 2010 at 7:47 AM, sarfaraz masood
wrote:
> Mr Lance
>
> Thanks
> a lot for ur reply.. I am a novice a solr / lucene. but
No, this is basic to how Lucene works. You will need larger EC2 instances.
On Mon, Jun 21, 2010 at 2:08 AM, Matteo Fiandesio
wrote:
> Compiling solr with lucene 2.9.3 instead of 2.9.1 will solve this issue?
> Regards,
> Matteo
>
> On 19 June 2010 02:28, Lance Norskog wrote
The result only has "ID". The field "type"
> disappeared. I need that "type" to know what the "ID" refer to. Why solr
> "eat" my "type"?
>
>
> Thanks.
> Regards.
> Scott
>
--
Lance Norskog
goks...@gmail.com
problems with this, and git is a lifesaver for
playing with patches etc.
Lance
On Wed, Jun 23, 2010 at 8:03 AM, Erick Erickson wrote:
> Did you see this page?"
> http://wiki.apache.org/solr/HowToContribute
>
> <http://wiki.apache.org/solr/HowToContribute>Especially down
:
>
> solrconfig.xml
>
>
> data-config.xml
>
>
> Hope this helps.
>
> - Robert Zotter
> --
>
--
Lance Norskog
goks...@gmail.com
Hi,
I am trying to get db indexing up and running, but I am having trouble
getting it working.
In the solrconfig.xml file, I added
data-config.xml
I defined a couple of fields in schema.xml
media_id is defined as the unique key.
How do I know if solr is actually loading my database driver properly? I
added the mysql connector to the solr/lib directory, and I added a <lib>
entry to solrconfig.xml just to be sure it would find the
connector. When I start the application, I see it loaded my dataImporter
data config, but when I try to acce
Yes, it is registered exactly as you indicated in solrconfig and when the
application starts up, I can see a message indicating the data-config is
loaded successfully. So although the data config is loaded successfully, I
cannot seem to access the dataimport handler.
Regards,
L. Hill
-----Original Message-----
at
> org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:839)
> at
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:250)
> at
> org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
> at
> org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
>
--
Lance Norskog
goks...@gmail.com
is used.
>
> Am I doing something wrong or is Solr not truly completely RESTful?
>
> thanks,
>
>
> Jason
>
--
Lance Norskog
goks...@gmail.com
Solr supports multi-valued fields. You can add various skills to one
field and it will store all of the values in order. You can search on
any of the values. For numbers, you might want a subtype_value
convention: skillYears1_9 as one of the values for the skillYears
field.
Lance
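A sketch in schema.xml plus an update document (all names illustrative, spelling out the subtype_value convention):

    <field name="skills" type="string" indexed="true" stored="true" multiValued="true"/>
    <field name="skillYears" type="string" indexed="true" stored="true" multiValued="true"/>

    <doc>
      <field name="id">resume-42</field>
      <field name="skills">java</field>
      <field name="skills">solr</field>
      <field name="skillYears">java_9</field>
      <field name="skillYears">solr_3</field>
    </doc>

A query like skillYears:java_9 then matches resumes claiming nine years of Java.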
On Mon, Jun 28
The 'bind error' means that you already had another Solr running. Use
'jps' to find all of the processes called 'start.jar' and kill them.
Lance
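For example (PIDs illustrative):

    $ jps -l            # one line per JVM: pid, then main class or jar
    4242 start.jar
    5151 start.jar
    $ kill 4242 5151    # stop the stray instances before restarting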
On Mon, Jun 28, 2010 at 2:36 PM, Lance Hill wrote:
> Hi,
>
>
>
> I am trying to get db indexing up and
this but I could not find the answer.
>> How can we know the required memory when facets are used so that I try to
>> scale my server/index correctly to handle it.
>>
>> Thanks
>>
>> Olivier
>>
>
--
Lance Norskog
goks...@gmail.com
visible
>> >> until you force the SOLR reader to reopen.
>> >>
>> >> HTH
>> >> Erick
>> >>
>> >> On Mon, Jun 28, 2010 at 6:49 PM, Peter Spam wrote:
>> >>
>> >>> On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>> >>>
>> >>>>> 1) I can get my docs in the index, but when I search, it
>> >>>>> returns the entire document. I'd love to have it only
>> >>>>> return the line (or two) around the search term.
>> >>>>
>> >>>> Solr can generate Google-like snippets as you describe.
>> >>>> http://wiki.apache.org/solr/HighlightingParameters
>> >>>
>> >>> Here's how I commit my documents:
>> >>>
>> >>> J=0;
>> >>> for i in `find . -name \*.txt`; do
>> >>> (( J++ ))
>> >>> curl "http://localhost:8983/solr/update/extract?literal.id=doc$J";
>> >>> -F "myfi...@$i";
>> >>> done;
>> >>>
>> >>> echo "- Committing"
>> >>> curl "http://localhost:8983/solr/update/extract?commit=true";
>> >>>
>> >>>
>> >>> Then, I try to query using
>> >>>
>> http://localhost:8983/solr/select?rows=10&start=0&fl=*,score&hl=true&q=testing
>> >>> but I only get back the document ID rather than the snippet:
>> >>>
>> >>>
>> >>> 0.05030759
>> >>>
>> >>> text/plain
>> >>>
>> >>> doc16
>> >>>
>> >>>
>> >>> I'm using the schema.xml from the "lucid imagination: Indexing text and
>> >>> html files" tutorial.
>> >>>
>> >>>
>> >>>
>> >>> -Pete
>> >>>
>> >
>>
>>
>
--
Lance Norskog
goks...@gmail.com
>
--
Lance Norskog
goks...@gmail.com
>>>> at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
>>>> at
>>>> org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
>>>> at
>>>> org.apache.commons.httpclient.Ht
how efficient and yet simple
> SOLR's (and Lucene's) query and response language (incl. response
> formats) is. Some things seem complex/difficult at first (like dismax or
> function queries) but turn out to be simple/easy to use considering the
> complexity of the problems they solve.
>
> Chantal
>
>
--
Lance Norskog
goks...@gmail.com
> store those snapshots, so we'd be pulling it over the wire only to write it
> right next to the original index. If we didn't have these HA clustering
> mechanisms available already, then I'm sure I'd be much more willing to look
> at a Solr master+slave architecture. But since we do, it seems like I'm a
> little bit hamstrung to use Solr's mechanisms anyway. So, that's my
> scenario, comments welcome. :)
>
> -dKt
>
>
>
>
--
Lance Norskog
goks...@gmail.com
For UnInvertedField faceting, the fieldType won't matter much at
> all for the space it takes up.
>
> The key here is that it looks like the number of unique terms in these
> fields is low - you would probably do much better with
> facet.method=enum (which iterates over terms rather than documents).
>
> -Yonik
> http://www.lucidimagination.com
>
--
Lance Norskog
goks...@gmail.com
I've looked at the problem. It's fairly involved. It probably would
take several iterations. (But not as many as field collapsing :)
On Wed, Jun 30, 2010 at 2:11 PM, Yonik Seeley
wrote:
> On Wed, Jun 30, 2010 at 4:55 PM, Lance Norskog wrote:
>> Apparently this is not ReStFuL