Hi all,
Announcing another Solr training course in Oslo, Norway, June 1st-3rd.
This is the 3-day Lucid Imagination course "Developing Search Applications with
Solr".
The training will be conducted in Norwegian.
For more information and sign-up, see www.solrtraining.com
--
Jan Høydahl, search
You'll have a hard time supporting stemming etc. with this approach. Perhaps a
hybrid solution, querying across the all-languages field and a few selected
language-specific fields which receive proper linguistic treatment? qf=text_all
text_en^2.0 text_de^1.5
Jan Høydahl
On 27. mai 2010, at
Hi,
Is there a token filter which does the same job as MappingCharFilterFactory but
after the tokenizer, reading the same config file?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
Standard DisMax does not fully support explicit AND/OR.
You can prove that by trying q=fuel+OR+cell and seeing that the score stays
the same (given mm=100%).
It appears that DisMax does SOME intelligent handling of AND/OR/NOT, because it
adds the + on the AND and a - on the NOT. But adding a
Consider upgrading to the 3.1 branch which gives you true sort by function
http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function
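For illustration, a 3.1-style sort-by-function request could look like this
(the field names inside sum() are made up for the example):
http://localhost:8983/solr/select?q=*:*&sort=sum(popularity,likes)+desc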
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 18. juni 2010, at 01.23, Chris Hostetter
Do you want to do this on every single user query, and present to the end
user which words matched where? In that case debugQuery may be too much, and I
would look into creating a custom debug component optimized to output only
the core parts of the explain section that you need.
If
It would be nice to have, because sometimes you want to normalize accents and
other characters but want to wait until other filters have run. Especially if
those filters are dictionary based and therefore need the original word form.
Do you have a clue of how different a CharFilter is from a
Or simply use add(), because setParam() overwrites the existing HashMap key:
solrQuery.setParam("stats.facet", "fieldA");
solrQuery.add("stats.facet", "fieldB");
solrQuery.add("stats.facet", "fieldC");
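A more complete SolrJ sketch of the same (the stats field "price" and
fieldA/fieldB/fieldC are placeholder names):

import org.apache.solr.client.solrj.SolrQuery;

public class StatsFacetExample {
  public static void main(String[] args) {
    SolrQuery solrQuery = new SolrQuery("*:*");
    solrQuery.setParam("stats", "true");
    solrQuery.setParam("stats.field", "price");  // hypothetical numeric field
    solrQuery.setParam("stats.facet", "fieldA"); // setParam overwrites any earlier value
    solrQuery.add("stats.facet", "fieldB");      // add() appends a second value
    solrQuery.add("stats.facet", "fieldC");      // ...and a third
    // Prints stats.facet three times, which is what the stats component expects
    System.out.println(solrQuery);
  }
}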
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe -
Hi,
Sometimes I do both. I put the defaults in solrconfig.xml and thus have one
place to define all kinds of low-level default settings.
But then I also make it possible in the application layer to add/override any
parameters. This gives you great flexibility to let server
administrators
Hi,
You might also want to check out the new Lucene-Hunspell stemmer at
http://code.google.com/p/lucene-hunspell/
It uses OpenOffice dictionaries with known stems in combination with a large
set of language specific rules.
It handles your example, but it is an early release, so test it
Hi,
In DisMax the mm parameter controls whether terms are required or optional.
The default is 100%, which means all terms are required, i.e. you do not need
to add +. You can change to mm=0 and you will get the same behaviour as the
standard parser, i.e. an OR behaviour, where the + would say that a
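For illustration, here is the same two-term query under the two mm settings
described above:
q=fuel cell&defType=dismax&mm=100%   -> all terms required (+fuel +cell)
q=fuel cell&defType=dismax&mm=0      -> all terms optional (fuel OR cell)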
Hi pal :)
Unfortunately copyField works only BEFORE analysis and you cannot chain
them...
The simplest solution would be to duplicate your copyFields:
<copyField source="title" dest="textanalyzemethod2"/>
<copyField source="body" dest="textanalyzemethod2"/>
<copyField source="title" dest="textanalyzemethod1"/>
AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 29. juni 2010, at 14.02, Lukas Kahwe Smith wrote:
On 29.06.2010, at 13:38, Lukas Kahwe Smith wrote:
On 29.06.2010, at 13:24, Jan Høydahl / Cominvent wrote:
Hi,
In DisMax the mm parameter controls whether terms are required
Hi,
You need to use HTTP POST in order to send those parameters, I believe. Try with
curl:
curl "http://localhost:8983/solr/update?commit=true" -H "Content-Type: text/xml"
--data-binary "<delete><query>uid:6-HOST*</query></delete>"
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Hmm, nice one - I was not aware of that trick.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 30. juni 2010, at 18.41, bbarani wrote:
Hi,
I was able to successfully delete multiple documents using the URL below
Hi,
I think I would look at a hybrid approach, where you keep adding new synonyms
to a query-side synonym dictionary for immediate effect. And then every now and
then, or every Nth night, you move those synonyms over to the index-side
dictionary and trigger a full reindex.
A nice side effect of
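To make the query-side half of that setup concrete, a sketch of the analyzer in
schema.xml (the dictionary file name is an assumption):

<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms-query.txt"
          ignoreCase="true" expand="true"/>
</analyzer>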
Have you had a look at www.twigkit.com ? Could be worth the bucks...
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 1. juli 2010, at 00.59, Peter Spam wrote:
Wow, thanks Lance - it's really fast now!
The last piece of
Hi,
I have chosen the same approach as you, indexing content into text_language
fields with custom analysis, and it works great. Solr does not have any
overhead with this even if there are hundreds of languages, due to the
schema-less nature of Lucene.
And if you know which language is being
Hi,
I had the impression that the StreamingUpdateSolrServer in SolrJ would
automatically use the /update/javabin UpdateRequestHandler. Is this not true?
Do we need to call
server.setRequestWriter(new BinaryRequestWriter()) for it to transmit content
with the binary protocol?
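For reference, the explicit setup being asked about would look roughly like
this (queue size and thread count are illustrative values):

import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

public class BinaryUpdateExample {
  public static void main(String[] args) throws Exception {
    // queue size 20, 4 background threads
    StreamingUpdateSolrServer server =
        new StreamingUpdateSolrServer("http://localhost:8983/solr", 20, 4);
    server.setRequestWriter(new BinaryRequestWriter()); // explicit switch to javabin
  }
}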
--
Jan Høydahl,
Hi,
Check out the new eDisMax handler (src) and the new pf2 parameter. Also
available as patch SOLR-1553.
Another option to avoid match for doc2 is to add application specific logic in
your frontend which detects car brands and years and rewrite the query into a
phrase or a filter.
--
Jan
Hi,
I would rather go for the boolean variant and spend some time writing a query
parser which tries to understand all kinds of input people may make, mapping it
into boolean filters. In this way you can support both navigation and search
and keep both in sync, whichever way people prefer to start
Hi,
SolrJ uses slf4j logging. As you can read on the wiki
http://wiki.apache.org/solr/Solrj#Solr_1.4 you need to provide the slf4j-jdk14
binding (or any other log framework you wish to bind to) yourself and add the
jar to your classpath.
--
Jan Høydahl, search solution architect
Cominvent AS
The Char-filters MUST come before the Tokenizer, due to their nature of
processing the character stream and not the tokens.
If you need to apply the accent normalization later in the analysis chain,
either use ISOLatin1AccentFilterFactory or help with the implementation of
SOLR-1978.
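For example, the token-level filter can be placed after the steps that need the
original, accented word form (the chain and stemmer choice here are just
illustrations):

<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="French"/>
  <filter class="solr.ISOLatin1AccentFilterFactory"/>
</analyzer>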
--
Jan
Check out slides 36-38 in this presentation for some hint on a possible
solution:
http://www.slideshare.net/janhoy/migrating-fast-to-solr-jan-hydahl-cominvent-as-euro-con
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 7.
What you are missing is a final
server.optimize();
Deleting a document will only mark it as deleted in the index until an
optimize. If disk space is a real problem in your case because you e.g. update
all docs in the index frequently, you can trigger an optimize(), say nightly.
--
Jan Høydahl,
Your use case can be solved by splitting the range into two ints:
Document: {title: "My document", from: 8000, to: 9000}
Query: q=title:My AND (from:[* TO 8500] AND to:[8500 TO *])
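A minimal schema sketch for the two fields (the tint type name is assumed from
the default example schema):

<field name="from" type="tint" indexed="true" stored="true"/>
<field name="to" type="tint" indexed="true" stored="true"/>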
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
Another way is to use the DisMax parser, and give it a qf=field1 field2 field3...
parameter; it will automatically search all the fields specified. It is more
powerful than having one default field, and saves that disk space. But you
sacrifice some extra resources during querying.
--
Jan
Hi,
Try solr.KeywordTokenizerFactory.
However, in your case it looks as if you have certain requirements for
searching that require tokenization. So you should leave the
WhitespaceTokenizer as-is and create a separate field specially for the
faceting, with indexed=true, stored=false and
Hi,
You don't need to duplicate the content into two fields to achieve this. Try
this:
q=mount OR mount*
The exact match will always get a higher score than the wildcard match, because
wildcard matches use constant score.
Making this work for multi-term queries is a bit trickier, but something
Hi,
Beware that post.jar is just an example tool to play with the default example
index located at the /solr/ namespace. It is very limited and you should look
elsewhere for a more production-ready and robust tool.
However, it has the ability to specify custom url. Please try:
java -jar post.jar
Hi,
Since EMAIL_HEADER_FROM is a String type, you need to specify the whole field
value every time. Wildcards could also work, but you'll get a problem with
leading wildcards.
The solution would be to change the fieldType into a text type using e.g.
StandardTokenizerFactory - if this does not break
Hi,
Yes, this is normal behavior. This is because Solr is *document* based; it does
not know about *files*.
What happens here is that your source database (or whatever) has had deletions
within this category in addition to updates, and you need to relay those to
Solr.
The best way to
Hi,
Which time zone are you located in? Do you have DST?
Solr uses UTC internally for dates, which means that NOW will be the time in
London right now :) Does that appear to be right for you?
Also see this thread: http://search-lucene.com/m/hqBed2jhu2e2/
--
Jan Høydahl, search solution architect
Hi,
Make sure you use a proper ID field, which does *not* change even if the
content in the database changes. In this way, when your delta-import fetches
changed rows to index, they will update the existing rows in your index.
--
Jan Høydahl, search solution architect
Cominvent AS -
q=mount OR mount* gives a different sort order than q=mount for those
documents including mount.
Change to q=mount^100 OR (mount?* -mount)^1.0, and test well.
Thanks very much!
2010/8/10 Jan Høydahl / Cominvent jan@cominvent.com
Hi,
You don't need to duplicate the content into two
Have a look at www.splunk.com
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 11. aug. 2010, at 19.34, Jay Flattery wrote:
Hi there,
Just wondering what tools people use to analyse SOLR log files.
We're looking to
Your syntax looks a bit funny.
Which version of Solr are you using? Pure negative queries are not supported,
try q=(*:* -title:janitor) instead.
Also, for debugging what's going on, please add debugQuery=true and share the
parsed query for both cases with us.
--
Jan Høydahl, search solution
Hi,
You can try the Tika command line to parse your Excel file; then you will see
the exact textual output from it, which will be indexed into Solr, and thus
inspect whether something is missing.
Are you sure you use a version of Luke which supports your version of Lucene?
--
Jan Høydahl, search
Use a tool to download a site to local disk, and ship the resulting HTML as a
folder or ZIP.
If that is not good enough, consider shipping the Reference Guide by
LucidImagination. It is one PDF and contains most of what you need. The
customer may be confused by LucidWorks specific chapters but
You can use the map() function for this, see
http://wiki.apache.org/solr/FunctionQuery#map
q=a fox&defType=dismax&qf=allfields&bf=map(query($qq),0,0,0,100.0)&qq=allfields:(quick AND brown AND fence)
This adds a constant boost of 100.0 if the $qq query returns a non-zero score,
which it does
If you want to change the schema on the live index, make sure you do a
compatible change, as Solr does not do any type checking or schema change
validation.
I would ADD a field with another name for the tint field.
Unfortunately you have to re-index to have an index built on this field.
May I
Hi,
Can you share with us how your schema looks for this field? What FieldType?
What tokenizer and analyser?
How do you parse the PDF document? Before submitting to Solr? With what tool?
How do you do the query? Do you get the same results when doing the query from
a browser, not SolrJ?
--
Jan
Some questions:
a) What operating system?
b) What Java container (Tomcat/Jetty)
c) What JAVA_OPTIONS? I.e. memory, garbage collection etc.
d) Example queries? I.e. what features, how many facets, sort fields etc
e) How do you load balance queries between the slaves?
f) What is your search latency
Hi,
I'm afraid you'll have to post the full document again, then do a commit.
But it WILL be lightning fast, as it is only the updated document which is
indexed, all the other existing documents will not be re-indexed.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Yes, I forgot that strings support alphanumeric ranges.
However, they will potentially be very memory intensive, since you don't get
the trie optimization and since strings take up more space than ints. The only
way is to try it out.
--
Jan Høydahl, search solution architect
Cominvent AS -
Høydahl / Cominvent jan@cominvent.com
To: solr-user@lucene.apache.org
Date: 18/08/2010 23:16
Subject: Re: Missing tokens
Cannot see anything obvious...
Try
http://localhost/solr/select?q=contents:OB10*
http://localhost/solr/select?q=contents:"OB 10"
http://localhost/solr
It is crucial to MEASURE your system to confirm your bottleneck.
I agree that you are very likely to be disk I/O bound with so little
memory left for the OS, a large index and many terms in each query.
Have your IT guys do some monitoring on your disks and log this while
under load. Then you
Hi,
You can place Solr wherever you want, but if your data is very large, you'd
want a dedicated box.
Have a look at DIH (http://wiki.apache.org/solr/DataImportHandler). It can both
crawl a file share periodically, indexing only files changed since a timestamp
(can be e.g. NOW-1HOUR) and
Check out the luke request handler:
http://localhost:8983/solr/admin/luke?fl=my_ad_field&numTerms=100 - you'll find
topTerms for the fields specified
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 20. aug. 2010, at 11.39,
Hi,
Try a wildcard term with a lower score:
q=title:work AND title:work*&debugQuery=true
You will now see from the debug printout that you get an extra boost for
workload.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com
On 22.
1. Currently we use Verity and have more than 20 collections; each collection
has an index for public items and an index for private items. So there are
virtual collections which point to each collection, and a virtual collection
which points to all. For example, we have AA and BB collections.
(CompositeParser.java:119)
... 24 more
RequestURI=/solr/lhcpdf/update/extract
Powered by Jetty (http://jetty.mortbay.org/)
***
-Original Message-
From: Jan Høydahl / Cominvent [mailto:jan@cominvent.com
Yes, this is really a pain sometimes.
I'd prefer a well defined base path, which could be assumed everywhere unless
otherwise documented.
SolrHome is one natural choice. For backward compat we could add a config in
solr(config).xml to easily switch to old behaviour.
Also, it makes sense to
?literal.collection=aaprivate&literal.id=doc1&commit=true"
-F fi...@myfile.xml
Thanks so much as always!
Xiaohui
-Original Message-
From: Jan Høydahl / Cominvent [mailto:jan@cominvent.com]
Sent: Friday, August 27, 2010 7:42 AM
To: solr-user@lucene.apache.org
Subject: Re: how to deal
)
*
-Original Message-
From: Jan Høydahl / Cominvent [mailto:jan@cominvent.com]
Sent: Tuesday, August 31, 2010 2:15 PM
To: solr-user@lucene.apache.org
Subject: Re: how to deal with virtual collection in solr?
Hi,
If you have multiple cores defined in your solr.xml you need to issue
Hi,
This smells like a job for Hadoop and perhaps Mahout, unless your use cases are
totally ad-hoc research.
After Nutch has fetched the sites, kick off some MapReduce jobs for each case
you wish to study:
1. Extract phrases/contexts
2. For each context, perform detection and whitelisting
3. In
there's a lull.
Thank you, - Scott
On Fri, Sep 3, 2010 at 1:19 AM, Jan Høydahl / Cominvent
jan@cominvent.com wrote:
Hi,
This smells like a job for Hadoop and perhaps Mahout, unless your use cases
are totally ad-hoc research.
After Nutch has fetched the sites, kick off some MapReduce
Just attended a talk at JavaZone (www.javazone.no) by Stephen Colebourne about
JSR-310, which will make these kinds of operations easier in a future JDK, and
how Joda-Time goes a great way toward enabling it today. I'm not saying it would
fix your GAP issue, as it's all about what definition of month
Hi,
Can you show us what result you actually get? Wouldn't it make more sense to
choose a numeric fieldType? To get a proper sort order for numbers in a string
field, all numbers need to be exactly the same length, since the order will be
lexicographical, i.e. 10 will come before 2, but after 02.
--
Jan
As Erick points out, you don't want a random doc as response!
What you're looking at is how to avoid the 0 hits problem.
You could look into one of these:
* Introduce autosuggest to avoid many 0-hits cases
* Introduce spellchecking
* Re-run the failed query with fuzzy turned on (e.g. alpha~)
*
Hi Tommaso,
Really cool what you've done. Looking forward to testing it, and I'm sure it's
a welcome contribution to Solr.
You can easily contribute your code by opening a JIRA issue and attaching a
patch file.
BTW, have you considered making the output field names configurable on a per
Hi,
You could simply create an autocomplete Solr Core with a simple schema
consisting of id, from, to:
Let the fieldType of from be String, and in the fieldType of to you can use
StandardTokenizer, WordDelimiterFilter and EdgeNGramFilter.
<add>
<doc>
<field
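A sketch of what the fieldType for the "to" field could look like (all names
and parameter values here are assumptions, so adjust to taste):

<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>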
:-)
Also, that wiki page clearly states in the very first line that it talks about
uncommitted Solr 4.0 stuff. I think that is pretty clear.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 22. sep. 2010, at 03.31, Lance Norskog wrote:
Developers, like marketers,
See this thread: http://search-lucene.com/m/FgbDS1JL3J1
Basically, what we normally do is to rename the fields with a language suffix,
so if you have language=en and text="A red fox", then you would index it as
text_en="A red fox". You would either have to do this outside Solr or write an
Solr will match this in version 3.1 which is the next major release.
Read this page: http://wiki.apache.org/solr/SolrCloud for feature descriptions
Coming to a trunk near you - see https://issues.apache.org/jira/browse/SOLR-1873
--
Jan Høydahl, search solution architect
Cominvent AS -
. 2010, at 10.44, Mike Thomsen wrote:
Interesting. So what you are saying, though, is that at the moment it
is NOT there?
On Mon, Sep 27, 2010 at 9:06 PM, Jan Høydahl / Cominvent
jan@cominvent.com wrote:
Solr will match this in version 3.1 which is the next major release.
Read this page
Hi,
Has anyone written any conditional functions yet for use in function queries?
I see the use for a function which can run different sub-functions depending on
the value of a field.
Say you have three documents:
A: title=Sports car, color=red
B: title=Boring car, color=green
C: title=Big
Ok, I created the issues:
IF function: SOLR-2136
AND, OR, NOT: SOLR-2137
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 28. sep. 2010, at 19.36, Yonik Seeley wrote:
On Tue, Sep 28, 2010 at 11:33 AM, Jan Høydahl / Cominvent
jan@cominvent.com wrote:
Have
A follow-up on the auction use case.
How do you handle the need for frequent updates of only one field, such as the
last bid field (needed for sorting on price, facets or ranges)?
For high-traffic sites, the document update rate becomes very high if you
re-send the whole document every time the bid
Did you solve this?
If yes, what was wrong?
If no, can you specify one concrete example document and a matching query which
fails to highlight?
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 7. jan. 2010, at 15.23, Xavier Schepler wrote:
Erick Erickson a écrit :
It's
Hi,
There is no JOIN functionality in Solr. The common solution is either to accept
the high-volume update churn, or to add client-side code to build a "join"
layer on top of the two indices. I know that Attivio (www.attivio.com) has
built some kind of JOIN functionality on top of Solr in their
value for a field. You can only use functions on it.
On Sat, Jan 30, 2010 at 7:05 AM, Jan Høydahl / Cominvent
jan@cominvent.com wrote:
A follow-up on the auction use case.
How do you handle the need for frequent updates of only one field, such as
the last bid field (needed for sort
NOTE: Please start a new email thread for a new topic (See
http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking)
Your strategy could work. You might want to look into dedicated entity
extraction frameworks like
http://opennlp.sourceforge.net/
Much more efficient to tag documents with language at index time. Look for
language identification tools such as
http://www.sematext.com/products/language-identifier/index.html or
http://ngramj.sourceforge.net/ or
You may also want to play with other highlighting parameters to select how much
text to do highlighting on, how many fragments etc. See
http://wiki.apache.org/solr/HighlightingParameters
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 9. feb. 2010, at 13.08, Ahmet Arslan
Hi,
Index replication in Solr makes an exact copy of the original index.
Is it not possible to add the 6 extra fields to both instances?
An alternative to replication is to feed two independent Solr instances - full
control :)
Please elaborate on your specific use case if this is not useful
Hi,
To match 1, 2, 3, 4 below you could use a fieldtype based on TextField, with
just a simple WordDelimiterFilterFactory. However, this would also match
abc-def, def.alpha, xyz-com and a...@def, because all punctuation is treated
the same. To avoid this, you could do some custom handling of -, .
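One possible sketch of such custom handling (the fieldType name, mapping file
name and its contents are all assumptions): rewrite just the punctuation you
care about with a char filter before tokenizing:

<fieldType name="text_punct" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-punct.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

where mapping-punct.txt could contain lines like "-" => " dash " so the hyphen
survives as its own searchable token while other punctuation is handled by
WordDelimiterFilterFactory as usual.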
Hi,
I'm using the /itas requestHandler, and would like to add spell-check
suggestions to the output.
I have spell-check configured and working in the XML response writer, but
nothing is output in Velocity. Debugging the JSON $response object, I cannot
find any representation of spellcheck
This sounds like an ideal use case for payloads. You could attach a boost value
to each term in your keywords field.
See
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
Another common workaround is to create, say, 8 multi-valued fields with boosts
0.5, 1.0, 1.5,
My point is that I WANT the AT, DOT to be indexed, to avoid these being treated
the same: foo-...@brown.fox and foo-bar.brown.fox
By using the LowerCaseFilterFactory before the replacements, you actually
ensure that a search for email:at will not give a match because the query will
be
09.02.2010 um 16:53 schrieb Jan Høydahl / Cominvent:
Hi,
Index replication in Solr makes an exact copy of the original index.
Is it not possible to add the 6 extra fields to both instances?
An alternative to replication is to feed two independent Solr instances -
full control :)
Please elaborate
Can you show us how you configured spell check?
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 10. feb. 2010, at 11.48, michaelnazaruk wrote:
Hello, all!
I have a problem with spellcheck! I downloaded, built and connected a
dictionary (~500,000 words). It works fine! But I
How about a field indextime_dt filled with NOW? Then do a facet query to get
the monthly stats for the last 12 months:
http://localhost:8983/solr/select/?q=*:*&rows=0&facet=true&facet.date=indextime_dt&facet.date.start=NOW/MONTH-12MONTHS&facet.date.end=NOW/MONTH%2B1MONTH&facet.date.gap=%2B1MONTH
To get min
Regarding hijacking, that was a false alarm. Apple Mail fooled me into believing
it was part of another thread. Sorry Jose.
I think the properties field approach is clean. It relies on index-time
classification, which is where such heavy lifting should preferably be done.
Faceting on a
You did not say how frequently you need to update the index, whether this is a
batch type of operation or if you also have some real-time requirements after
the initial load.
Your ETL could use SolrJ and the StreamingUpdateSolrServer for high throughput.
You could try multiple threads pushing in
, just to be clear, isn't really JSON, it's a
toString() that looks similar though. Or did you convert it to JSON in some
other fashion? /itas?q=mispeled&wt=json should also show the spelling
suggestions.
Erik
On Feb 9, 2010, at 7:30 PM, Jan Høydahl / Cominvent wrote:
Hi,
I'm
Can you show us your field definitions and the exact query string you are
using, and what you expect to see?
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 11. feb. 2010, at 15.31, adeelmahmood wrote:
hi there
i am trying to get familiar with solr while setting it
- Nutch
Hadoop ecosystem search :: http://search-hadoop.com/
- Original Message
From: Jan Høydahl / Cominvent jan@cominvent.com
To: solr-user@lucene.apache.org
Sent: Mon, February 8, 2010 3:33:41 PM
Subject: Re: Collating results from multiple indexes
Hi
Hi,
This is probably due to length normalization. Normally this is wanted, as you
want to penalize a partial match vs. a more exact match.
Try specifying omitNorms=true on your field.
You should ask yourself what kind of relevancy or sorting you really need in
your project. If you search short
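For reference, the flag goes on the field definition in schema.xml, e.g. (field
name and type are illustrative):
<field name="title" type="text" indexed="true" stored="true" omitNorms="true"/>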
Hi,
Have you tried playing with mergeFactor or even mergePolicy?
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 16. feb. 2010, at 08.26, Janne Majaranta wrote:
Hey Dipti,
Basically query optimizations + setting cache sizes to a very high level.
Other than that, the
After ZooKeeper is integrated (1.5?) there will be a way to get info about all
nodes in your cluster including their roles, status etc. Perhaps you want to
coordinate your dashboard effort with this version, although still very early
in development? See http://wiki.apache.org/solr/SolrCloud
--
have much to do with Lucene/SOLR
except where they integrate with the query execution. If you want to learn
more feel free to check out www.attivio.com.
- w...@attivio.com
On Fri, Feb 12, 2010 at 10:35 AM, Jan Høydahl / Cominvent
jan@cominvent.com wrote:
Really
Hi,
Does open for public mean end users through browser or web sites through API?
In either case you should have a front end proxying the traffic through to
Solr, which explicitly allows only parameters that you allow.
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 17.
A mature document processing pipeline, perhaps an integration of
www.openpipeline.org, which is Apache 2.0 licensed
Hi Mark,
If (a) is the wanted behaviour, i.e. having a business show up in facets for all
ZIPs, you should define a multi-valued ZIP field. Since a ZIP is a number, I
don't see any reason for any analysis on it; a String or a lightly normalized
field type would do the job both for search and facets.
Hi,
In the current version you need to handle the cluster layout yourself, both on
the indexing and search sides, i.e. route documents to shards as you please,
and know which shards to search.
We try to address how to make this easier in
http://wiki.apache.org/solr/SolrCloud - have a look at it. The
Hi,
Yes, it will be a really nice package. I think the aim is to keep the ZK stuff
optional, which can be nice for small installs or upgrading without embracing
the ZK parts. All of this is still in the beginning of development.
Much of the cloud stuff is aimed at 1.5 but there are as usual no
Also, the eDisMax query parser will be a welcome tool for these kinds of
requirements:
https://issues.apache.org/jira/browse/SOLR-1553
From the feature list: advanced stopword handling... stopwords are not
required in the mandatory part of the query but are still used (if indexed) in
the proximity
You probably don't want to include words in your dictionary which are not in
your index.
Have you tried Solr's feature to generate spellcheck dictionary from one or
more of your index fields?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe -
Hi,
Sometimes you need to anchor your search to the start/end of a field.
Example:
1. title=New York Yankees
2. title=New York
3. title=York
If I search title:"New York" or title:York I would get a match, but I'd like
to anchor my search to the beginning and/or end of the field, e.g. with regex
syntax,
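For illustration of the wanted behaviour only (this anchored syntax is a sketch
of the idea, not something Solr supports today):
q=title:/^New York$/   -> should match only document 2 (whole field anchored)
q=title:/^York/        -> should match only document 3 (anchored to the start)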