I can't delete 1s, 2s, etc. from my
field value; I have to keep the text in
this format. So I'll apply slop in my search to get the
results I need.
It is OK if you can't delete 1s, 2s, etc. from the field value. We can eat up
those special markups in the analysis chain.
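As a rough illustration of "eating up" the markers, the sketch below strips them with a regular expression. It assumes the markers look like "1s:", "2s:" (as in the sample string quoted elsewhere in this thread); the function name and the pattern are hypothetical. In Solr itself the equivalent step would live in the field type's analysis chain (for instance a pattern-replace char filter), so the stored value keeps its markers while the indexed tokens do not.

```python
import re

# Assumption: markers are a number followed by "s:", e.g. "1s:", "2s:".
MARKER = re.compile(r"\b\d+s:\s*")

def strip_markers(text: str) -> str:
    """Remove '<number>s:' markers, leaving the surrounding text intact."""
    return MARKER.sub("", text)

print(strip_markers("1s: This is 2s: a test"))
```

Only the analyzed form changes in this approach; the original field value, including the markers, would still be what is stored and returned.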
Hi Koji,
Thanks for the reply.
I tried adding a BoundaryScanner to my solrconfig.xml and set
hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and
termOffsets=on in my query. Even so, I didn't see any effect on my
highlighting.
My Solr config settings are below.
I tried adding a BoundaryScanner to my solrconfig.xml and set
hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and
termOffsets=on in my query. Even so, I didn't see any effect on my
highlighting.
Am I missing anything, or doing anything wrong?
i like to make a
(11/12/28 17:08), Ahmet Arslan wrote:
FastVectorHighlighter requires Solr3.1
http://wiki.apache.org/solr/HighlightingParameters#hl.useFastVectorHighlighter
Right. In addition, boundaryScanner requires 3.5.
koji
--
http://www.rondhuit.com/en/
The SignatureUpdateProcessor is for exactly this problem:
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/Deduplication
On Tue, Dec 27, 2011 at 10:42 PM, Alexander Aristov
alexander.aris...@gmail.com wrote:
I get docs from external sources and the only place I keep
You would have to implement this yourself in your indexing code. Solr
has an analysis plugin which does the analysis for your text and then
returns the result, but does not query or index. You can use this to
calculate the fuzzy hash, then search against index.
You might be able to code this in
Dear list,
I'd like to bounce on that issue...
IMHO, configuration parsing could be a little bit stricter... At least,
what stands for a severe configuration error could be user-defined.
Let me give some examples that are common errors and that don't trigger
the abortOnConfigurationError
Thanks for your reply. I thought about using the debug mode, too, but
the information is not easy to parse and doesn't contain everything I
want. Furthermore, I don't want to enable debug mode in production.
Is there anything else I could try?
On Tue, Dec 27, 2011 at 12:48 PM, Ahmet Arslan
Hi all.
From my code review, I discovered the following:
1) as I wrote before, there seems to be a low disk read speed;
2) at ~/solr-3.5/solr/core/src/java/org/apache/solr/response/XMLWriter.java
and in the same classes there is a writeDocList = writeDocs method, which
contains a cycle for of all
Thanks iorixxx and Koji for your replies.
So can I fulfil my requirement by using hl.regex.pattern and setting
hl.fragmenter=regex?
I was looking at these parameters on the wiki. I am thinking of using them to
make my highlighted text show in my desired format.
My string is like below:
1s: This is
The problem with dedupe (SignatureUpdateProcessor) is that it REPLACES old
docs. I have already tried it.
Best Regards
Alexander Aristov
On 28 December 2011 13:04, Lance Norskog goks...@gmail.com wrote:
The SignatureUpdateProcessor is for exactly this problem:
The issue I'm facing is that I don't get the expected results when I combine
the group param and the sort param.
The query is:
http://localhost:8080/solr/core1/select/?qt=nutch&q=*:*&fq=userid:333&group=true&group.field=threadid&group.sort=date%20desc&sort=date%20desc
where threadid is a hexadecimal string
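The parameters of the query above can also be assembled programmatically, which avoids separator and escaping mistakes. This is only a sketch: the parameter names and values are the ones readable in the URL in this message, while `build_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8080/solr/core1/select/"

# Parameters as they appear in the query quoted in this thread.
params = [
    ("qt", "nutch"),
    ("q", "*:*"),
    ("fq", "userid:333"),
    ("group", "true"),
    ("group.field", "threadid"),
    ("group.sort", "date desc"),  # ordering of documents inside each group
    ("sort", "date desc"),        # ordering of the groups themselves
]

def build_url(base, params):
    """Join the base URL and URL-encoded parameters with '?' and '&'."""
    return base + "?" + urlencode(params)

url = build_url(BASE, params)
print(url)
```

Keeping `sort` and `group.sort` as separate entries makes it easier to change one at a time while checking which of the two produces the unexpected ordering.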
http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml
I am trying to index this file; for this I am using this command:
java -jar post.jar *.xml
The command runs fine, but when I search no results are displayed.
I think it is an encoding problem. Can anyone help?
--
View this
Hi,
Thanks a lot guys. I tried the following options
1.) Downloaded the solr 3.5.0 version and updated the schema.xml file with
the sample fields i have. I then tried to set the property
ignoreCaseForWildcards=true for a field type as mentioned in the url given
for the patch-2438, but got the
Thanks community! That helps!
To check practically, I have now setup Solr 3.5 in test environment. Few
observations on that,
1. I simply copy-pasted one of the Solr 1.4 instance on Solr 3.5 setup
(after correcting schema.config and solr.config files based on what is
suited for 3.5). If
Could it be a commit you're needing?
curl 'localhost:8983/solr/update?commit=true'
/Martin
On Wed, Dec 28, 2011 at 11:47 AM, mumairshamsi mumairsha...@gmail.com wrote:
http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml
i am trying to index this file for this i am using
I must be missing something here. Why would this be any different from
any other singleton? I just did a little experiment where I implemented
the classic singleton pattern in a RequestHandler and accessed it
from a Filter (both plugins) with no problem at all, just the usual
blah var =
There's no easy/efficient way that I know of to do this. Perhaps a good
question is what value-add this is going to make for your app and is
there a better way to convey this information. For instance, would
highlighting convey enough information to your user?
You're right that you don't want to
Well, the short answer is that nobody else has
1) had a similar requirement
AND
2) not found a suitable workaround
AND
3) implemented the change and contributed it back.
So, if you'd like to volunteer <G>.
Seriously. If you think this would be valuable and are
willing to work on it, hop on over
Right, you were misled by the discussion for that patch;
the option you specified was NOT how the patch was
eventually implemented. Try reading this page instead:
http://wiki.apache.org/solr/MultitermQueryAnalysis
The short form is that with 3.6 (i.e. 3.x at this point) you
may not have to do
Thanks Erick,
that gives me a direction. I will write a new plugin and get back to the
dev forum with the results, and then we will decide on next steps.
Best Regards
Alexander Aristov
On 28 December 2011 18:08, Erick Erickson erickerick...@gmail.com wrote:
Well, the short answer is that nobody
On Wed, Dec 28, 2011 at 5:47 AM, ku3ia dem...@gmail.com wrote:
So, based on point 2) and on my previous research, I conclude that the more
documents I want to retrieve, the slower the search is, and the main problem is
the loop in the writeDocs method. Am I right? Can you advise something in this
Hello Alexander,
I don't know much about your requirements in terms of size and
performance, but I've had a similar use case and found a pretty simple
workaround.
If your duplicate rate is not too high, you can have the
SignatureProcessor generate a fingerprint of documents (you already did
Hello,
I'm running Solr 3.5 on a XAMPP/Tomcat environment. It's working pretty well
with just one exception: when Solr remains idle without handling any requests
for about 5-10 mins, the first request sent afterwards is delayed for a few
seconds. Subsequent requests are lightning-fast as usual. So
On Wed, Dec 28, 2011 at 8:52 PM, Odey mariofi...@googlemail.com wrote:
Hello,
I'm running Solr 3.5 on a XAMPP/Tomcat environment. It's working pretty well
with just one exception: when Solr remains idle without handling any requests
for about 5-10 mins the first request sent again will be
: That said, writing your own update request handler
: that detected this case isn't very difficult,
: extend UpdateRequestProcessorFactory/UpdateRequestProcessor
: and use it as a plugin.
I can't find the thread at the moment, but the general issue that has
caused people headaches with this
You really haven't posted enough details for people to guess as to what
your problem might be (in particular: the actual examples of your configs,
and any log messages during the import).
Please consult this wiki page and then post a followup with more
details...
: Can I use a XPathEntityProcessor in conjunction with an
: ExtractingRequestHandler? Also, the scripting language that
: XPathEntityProcessor uses/supports, is that just ECMA/JavaScript?
:
: Or is XPathEntityProcessor only supported for use in conjunction with the
: DataImportHandler?
The
: Exception in thread "main" java.io.IOException: Job failed!
:   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
:   at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
:   at
: I have a lot of files in my FTP account, and I use curlftpfs to mount
: them to a folder and then index them with the SolrJ API, but after a few
: minutes something strange happens: the mounted folder is no longer accessible
: and crashes. Also, I cannot unmount it, and the message device is in
: Of course. What I meant to say was there is
: always exactly one token in a non-tokenized
: field and its offset is always exactly 0. There
: will never be tokens at position 1.
:
: So asking to match phrases, which is based on
: term positions is basically a no-op.
That's not always true.
I've seen in the solr faceting overview that it is possible to sort
either by count or lexicographically, but is there a way to sort so
the lowest counts come back first?
Hi,
I don't have an answer, but maybe I can help you if you provide more
information, for example:
- Which Solr version are you running?
- Which is the type of the date field?
- The output you are getting
- The output you expect
- Any other information that you consider relevant.
Thanks,
What else, if anything, do you have running on the server?
Because it's possible that pages are being swapped out
for other processes to use.
Solr itself shouldn't, as far as I know, time out anything, so I
expect you're running into issues with the operating system.
Best
Erick
On Wed, Dec 28, 2011 at
Hi Parvin,
You must also add the query parser definition to solrconfig.xml, for
example:
<queryParser name="graph" class="org.gasimzade.solr.GraphQParserPlugin"/>
*Juan*
On Wed, Dec 28, 2011 at 4:16 AM, Parvin Gasimzade
parvin.gasimz...@gmail.com wrote:
Hi all,
I have created custom Solr
On Wed, Dec 28, 2011 at 2:16 AM, Parvin Gasimzade
parvin.gasimz...@gmail.com wrote:
I have created custom Solr FunctionQuery in Solr 3.4.
I extended ValueSourceParser, ValueSource, Query and QParserPlugin classes.
Note that you only need a QParserPlugin implementation for top level
query types,
Right, I think that's what's happening here.
Google swappiness if you are on Linux.
Alternatively, one could add something to prevent the OS from swapping out
Solr's process. Here is how ElasticSearch does it, for example:
https://github.com/elasticsearch/elasticsearch/issues/464
Otis
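For anyone checking the swappiness setting mentioned here, the small sketch below reads the current value. It assumes a Linux host, where the kernel exposes the setting at /proc/sys/vm/swappiness; the function name is hypothetical, and on other systems the function simply returns None.

```python
from pathlib import Path
from typing import Optional

def read_swappiness(path: str = "/proc/sys/vm/swappiness") -> Optional[int]:
    """Return the kernel swappiness value, or None if the file is absent."""
    p = Path(path)
    if not p.is_file():
        return None  # not Linux, or procfs is not mounted
    return int(p.read_text().strip())

print(read_swappiness())
```

A high value means the kernel is more willing to swap out idle process memory, which would be consistent with a slow first request after an idle period.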
: Is it possible that the system is running out of RAM, and swapping,
: or is aggressively swapping for some reason?
it doesn't have to be the solr /tomcat process memory getting swapped out
-- but that's certainly possible -- it could also be that the filesystem
cache is expunging the disk
(11/12/29 5:50), Jamie Johnson wrote:
I've seen in the solr faceting overview that it is possible to sort
either by count or lexicographically, but is there a way to sort so
the lowest counts come back first?
As far as I know, no. What is your use case?
koji
--
http://www.rondhuit.com/en/
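Since there is no built-in ascending count sort, one workaround is to re-sort the facet values on the client. The sketch below assumes the facet counts arrive as the flat list [term, count, term, count, ...] that the JSON response writer produces by default; the function name and the sample data are hypothetical.

```python
def facets_lowest_first(flat_counts):
    """Turn [term, count, term, count, ...] into (term, count) pairs,
    sorted by ascending count, with ties broken alphabetically."""
    pairs = list(zip(flat_counts[0::2], flat_counts[1::2]))
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))

# Made-up sample, shaped like one entry of facet_counts/facet_fields.
sample = ["paris", 12, "london", 3, "rome", 7]
print(facets_lowest_first(sample))
```

This only re-orders the values Solr actually returned, so to find the truly rarest terms the request would need to return every facet value (for example with facet.limit=-1), which can be expensive on fields with many distinct terms.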
I have a database where a user is searching for documents, and the
things which I'm faceting on are tags. Tags boil down to things of
interest, perhaps names, places, etc. The user in our case has asked
for the ability to change the ordering so they can easily find things
that appear very
: I've seen in the solr faceting overview that it is possible to sort
: either by count or lexicographically, but is there a way to sort so
: the lowest counts come back first?
Peter Sturge looked into this a while back and provided a patch, but there
were some issues with it that never got
: Subject: Sort facets by defined custom Collator
deja-vu...
http://www.lucidimagination.com/search/p:solr/s:email/l:user/sort:date?q=%22Facet+Ordering%22
-Hoss
On Tue, Dec 27, 2011 at 1:10 PM, Ahmet Arslan iori...@yahoo.com wrote:
To achieve this behavior, you can use StandardTokenizerFactory and
EdgeNGramFilterFactory and LowerCaseFilterFactory at index time.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
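To make the effect of that index-time chain concrete, here is a simplified sketch of what ends up in the index. It is not Solr code: it splits on whitespace instead of using the real StandardTokenizer rules, and the function names and gram-size defaults are assumptions for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Lower-case a token and return its leading-edge n-grams."""
    token = token.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

def index_terms(text, min_gram=1, max_gram=15):
    """Whitespace-tokenize text and emit edge n-grams for every token."""
    terms = []
    for token in text.split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms

print(index_terms("Solr Rocks", 1, 3))
```

Because every prefix of each word is indexed as its own term, a query for a partial word such as "sol" matches without needing a wildcard at query time.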
Yes, the 3.5 Solr is opening and reading the Solr 1.4 index. When you
do a commit, it will rewrite the index in 3.5 format.
Doing a complete copy of the configs from 1.4 to 3.5 is easy, but
there are a lot of new features and changed defaults in the
solrconfig.xml file. These make indexing
Here is an example of schema design: a PDF file of 5MB might have
maybe 50k of actual text. The Solr ExtractingRequestHandler will find
that text and only index that. If you set the field to stored=true,
the 5MB will be saved. If stored=false, the PDF is not saved. Instead,
you would store a link
This copying is a bit overstated here because of the way that small
segments are merged into larger segments. Those larger segments are then
copied much less often than the smaller ones.
While you can wind up with lots of copying in certain extreme cases, it is
quite rare. In particular, if you
Unfortunately I have a lot of duplicates, and since searching might
suffer, I will try implementing an update processor.
But your idea is interesting and I will consider it, thanks.
Best Regards
Alexander Aristov
On 28 December 2011 19:12, Tanguy Moal tanguy.m...@gmail.com wrote:
Hello
Yes, I have been warned that querying the index each time before adding a doc
to the index might be resource-consuming. I will check it.
As for the overwrite parameter, I think the name is not the best, then.
People outside the business, like me, misuse it and assume what I wrote.
Overwrite shall mean what it means.
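The behaviour asked for in this thread (keep the first copy of a document and ignore later duplicates, rather than replacing the old one) can be sketched as follows. This is an in-memory stand-in, not the Solr API: the class, the field names and the use of MD5 are assumptions for illustration, and a real implementation would do the lookup against the index from a custom update processor.

```python
import hashlib

def signature(doc, fields):
    """Fingerprint a document from the chosen fields."""
    digest = hashlib.md5()
    for name in fields:
        digest.update(str(doc.get(name, "")).encode("utf-8"))
        digest.update(b"\x00")  # separator so field boundaries stay distinct
    return digest.hexdigest()

class KeepFirstIndex:
    """Toy index that drops a document if its signature was already seen."""

    def __init__(self, fields):
        self.fields = fields
        self.docs = {}  # signature -> first document seen

    def add(self, doc):
        sig = signature(doc, self.fields)
        if sig in self.docs:
            return False  # duplicate: keep the old document untouched
        self.docs[sig] = doc
        return True

index = KeepFirstIndex(fields=["url", "body"])
print(index.add({"id": 1, "url": "a", "body": "x"}))  # first copy is stored
print(index.add({"id": 2, "url": "a", "body": "x"}))  # duplicate is skipped
```

The lookup before every add is the cost mentioned above: against a real index it means one extra query per document.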
It seems my operating system was causing me trouble in some way. I
couldn't find what was triggering this issue, but after migrating the whole
project from WAMP to LAMP it has been resolved and everything is running
smoothly again.
Thank you very much for your help!
Regards,
Alexander,
I have two ideas for implementing fast dedupe externally, assuming your PKs
don't fit in a java.util.*Map:
- your crawler can use an in-process RDBMS (Derby, H2) to track dupes;
- if your crawler is stateless (it doesn't track PKs which have already
been crawled), you can retrieve
Hi Juan,
I'm using Solr 3.1
The type of the date field is long.
Let's say the documents indexed in the Solr server are:
<doc>
<str name="uniqueid">1326c5cc09bbc99a_1</str>
<str name="threadid">1326c5cc09bbc99a</str>
<long name="date">1316078009000</long>
.. Some Other fields here ..
<str name="subject">Some
Erick,
OK. Let me try with a plain Java one. Possibly I'll need tighter
integration, like injecting a core into the singleton, etc. But I don't know
yet.
Thanks for your efforts.
On Wed, Dec 28, 2011 at 5:48 PM, Erick Erickson erickerick...@gmail.com wrote:
I must be missing something here.
Thank you for your answers.
I have a Map<docId, score> and want to boost the score of those documents
at search time.
In my example I get that map inside the ValueSource and boost the scores of
the matched documents.
If {!graph} is added to the query, it will return the boosted query;
otherwise it will
55 matches