To Alex,
Thanks for your advice. I did ask, and I understand that the requirement is
necessary for them. They won't browse all the results on one page, but they
will use the query results to do some additional research.
So what they want is something that exactly matches the query. So they need to pull
out
Thanks Alex!
Yes, you hit my key points.
Actually, I have to implement both of the requirements.
The first one works very well, for the reason you stated. I now have a website
client that shows 20 records per page. It is fast.
However, my customer also wants to use a Servlet to download the whole query
I am sorry that I can't get your point. Would you explain a little more?
I am still struggling with this problem. It sometimes seems to crash for no
apparent reason, even when I reduce to 5000 records each time, but sometimes it
works well with 1 per page.
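One way to keep memory flat while pulling a large result set is to fetch and hand off one batch at a time rather than collecting everything first. The sketch below only illustrates that idea: `iter_batches` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for whatever call actually runs the query with a start offset and a row count.

```python
from typing import Callable, Iterator, List


def iter_batches(fetch: Callable[[int, int], List[dict]],
                 total: int, rows: int = 5000) -> Iterator[List[dict]]:
    """Yield result batches one at a time instead of holding them all in memory."""
    start = 0
    while start < total:
        batch = fetch(start, min(rows, total - start))
        if not batch:  # fewer documents came back than expected; stop early
            break
        yield batch
        start += len(batch)


# Hypothetical stand-in for a real query call that takes (start, rows).
def fake_fetch(start: int, rows: int) -> List[dict]:
    return [{"id": i} for i in range(start, min(start + rows, 12000))]


if __name__ == "__main__":
    for batch in iter_batches(fake_fetch, total=12000, rows=5000):
        # Write each batch to the output stream here, then let it go.
        print(len(batch))
```

Each batch can be written to the download stream and dropped before the next one is requested, so only one batch is held at a time.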
I am using Solr 3.5 and Jetty 8.12.
I need to pull out huge query results at a time (for example, 1 million
documents, probably a couple of gigabytes in size), and my machine has about
64 GB of memory.
I use the javabin format and SolrJ as my client, and I use a Servlet to provide a
query download service for the end
Thanks for your suggestion. I will try it later and give you feedback if
possible.
For now, my workaround is to remove some of the n-grams.
Thanks again!
--
View this message in context:
http://lucene.472066.n3.nabble.com/The-index-speed-in-the-solr-tp3931338p3939366.html
Sent from the Solr - User mailing list
You are very helpful. Thanks a lot!
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-can-I-get-the-top-term-in-solr-tp3926536p3931252.html
Sent from the Solr - User mailing list archive at Nabble.com.
It takes me 50 hours to index a 9 GB file (about 2,000,000 documents)
with an n-gram filter with min=6 and max=10. My tokens before the n-gram filter
are long (not words; at most 300,000 bytes, including white space). I split the
file into 4 files and use post.sh to update them at the same time. I also tried
to write
a
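A rough count suggests why indexing is slow with these settings. The sketch below assumes one token of a given length and character n-grams of every size from min to max; `ngram_count` is a hypothetical helper, not part of Solr.

```python
def ngram_count(token_length: int, min_gram: int, max_gram: int) -> int:
    """Number of character n-grams a single token of this length produces."""
    return sum(max(0, token_length - n + 1)
               for n in range(min_gram, max_gram + 1))


if __name__ == "__main__":
    # One 300,000-character token with min=6, max=10
    print(ngram_count(300_000, 6, 10))  # 1499965
```

Under that assumption, one maximum-length token alone expands to about 1.5 million n-grams.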
I have to discard this method at this time. Thank you all the same.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Further-questions-about-behavior-in-ReversedWildcardFilterFactory-tp3905416p3926423.html
Sent from the Solr - User mailing list archive at Nabble.com.
Actually, I would like to know two meanings of the top term: at the document
level and at the index file level.
1. The top term at the document level means that I would like to know the terms
with the highest frequency across all documents (counting a term only once per document).
The Solr schema.jsp seems to provide the top 10 terms, but
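To make the document-level meaning concrete: it is a document-frequency count, where a term contributes at most once per document no matter how often it repeats there. A minimal sketch on plain Python lists, with made-up sample data; `top_terms_by_document_frequency` is a hypothetical helper and is not how schema.jsp computes its list.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_terms_by_document_frequency(
        docs: Iterable[List[str]], k: int = 10) -> List[Tuple[str, int]]:
    """Rank terms by the number of documents that contain them."""
    df = Counter()
    for terms in docs:
        df.update(set(terms))  # set(): count each term once per document
    return df.most_common(k)


if __name__ == "__main__":
    sample = [["a", "a", "b"], ["a", "c"], ["b", "a"]]  # illustrative data only
    print(top_terms_by_document_frequency(sample, k=2))  # [('a', 3), ('b', 2)]
```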
I asked the question in
http://lucene.472066.n3.nabble.com/A-little-onfusion-with-maxPosAsterisk-tt3889226.html
However, when I tried to implement it, I ran into some further questions.
1. Suppose I don't use ReversedWildcardFilterFactory at index time. It
seems that Solr doesn't allow the leading
Both are version 3.5.
I have verified that Solr can read the index files written by Lucene,
but I tried to use Lucene to read a specific field from the index files.
It returns results when I do the *.* search
Here are my fields:
<field name="id">101</field>
<field name="sequence">NGHGJGKGKLHJFKGJGKGK</field>
The sequence field is from 300 bytes to 56K bytes, with no spaces.
I want n-grams of size 3 to 8:
NGH GHG HGJ ...
NGHG GHGJ HGJG ...
...
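The listing above can be reproduced with a few lines of Python. This is only a sketch of character n-gram generation for checking the expected tokens by hand; `char_ngrams` is a hypothetical helper and is not the Solr tokenizer itself.

```python
from typing import Iterator


def char_ngrams(text: str, min_gram: int, max_gram: int) -> Iterator[str]:
    """Yield every character n-gram, all of one size before the next size."""
    for n in range(min_gram, max_gram + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]


if __name__ == "__main__":
    grams = list(char_ngrams("NGHGJGKGKLHJFKGJGKGK", 3, 8))
    print(grams[:3])  # ['NGH', 'GHG', 'HGJ']
    print(len(grams))
```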
<fieldType name="nGram1" class="solr.TextField"
positionIncrementGap="100">
neosky wrote
I use Solr version 3.5.
1. It seems that NGramTokenizerFactory only tokenizes the first 1024
characters. I searched for the problem on the Internet; somebody noticed the
bug in 2007, but I can't find a solution.
PS: my max field length has been modified
<maxFieldLength>5</maxFieldLength>
This
great! thanks!
--
View this message in context:
http://lucene.472066.n3.nabble.com/A-little-onfusion-with-maxPosAsterisk-tp3889226p3890776.html
Sent from the Solr - User mailing list archive at Nabble.com.
Because the first query result doesn't meet my requirement,
I have to do a secondary process manually based on the full results of the
first query.
Only after I finish the secondary process do I begin to show the results to the
end user in pages of specific records (for instance, as Solr does, 10 records at a
time)
one
maxPosAsterisk - maximum position (1-based) of the asterisk wildcard ('*')
that triggers the reversal of query term. Asterisk that occurs at positions
higher than this value will not cause the reversal of query term. Defaults
to 2, meaning that asterisks on positions 1 and 2 will cause a reversal.
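Read literally, the description above comes down to a position check. The sketch below models only that documented rule, using the first asterisk in the term (an assumption) and ignoring every other option the factory has; `asterisk_triggers_reversal` is a hypothetical name, not a Solr API.

```python
def asterisk_triggers_reversal(term: str, max_pos_asterisk: int = 2) -> bool:
    """True if the first '*' sits at a 1-based position <= max_pos_asterisk."""
    pos = term.find("*") + 1  # 1-based position; 0 when there is no asterisk
    return 0 < pos <= max_pos_asterisk


if __name__ == "__main__":
    for term in ["*abc", "a*bc", "ab*c", "abc"]:
        print(term, asterisk_triggers_reversal(term))
```

With the default of 2, `*abc` and `a*bc` would be reversed, while `ab*c` would not.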
1. I tokenized my sequence field into 5-grams, and I search as follows:
http://192.168.52.137:8983/solr/select?indent=on&defType=dismax&version=2.2&q=sequence:N%20sequence:N%20sequence:G&fq=&start=0&rows=10&fl=*,score&qt=&wt=&explainOther=&hl=on&hl.fl=sequence
I want to return a document with
Does anyone know whether it is a bug or not?
I use n-grams in my index.
<fieldType name="text_general_rev" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="5"
maxGramSize="5"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer
My current version is Solr 3.5. It should be the latest.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Why-my-highlights-are-wrong-one-character-offset-tp3860286p3862872.html
Sent from the Solr - User mailing list archive at Nabble.com.
All of my highlights are off by one character in the offset. Here are some
fragments from my response. Thanks!
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">259</int>
<lst name="params">
<str name="explainOther"/>
<str name="indent">on</str>
<str name="hl.fl">sequence</str>
<str name="wt"/>
<str
Thanks! I looked at the API carefully before, but I was not very sure.
So it seems that the highlighter might not be helpful.
I am considering an alternative solution for this problem.
I would like to explain what exactly I want. For instance, I got a candidate
record from my query: RVCES (I implemented a 5-gram index).
Can the highlighter provide the exact position of the query?
For instance:
MSAQLRKPTA*RVCES*CGRAEHWDDDLEAWQIARTDGTKQVGSPHCLHEWDINGNFNPVAMDD
I want to know the position of R in the highlighted token.
I want to do the secondary query based on the position. Thanks!
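If the highlighter cannot report the position, one fallback is to locate the query in the stored field value on the client side. This is a sketch of that post-processing step, not of the Solr highlighter; `find_positions` is a hypothetical helper, and the sequence is the example above with the two asterisk markers removed.

```python
from typing import List


def find_positions(sequence: str, query: str) -> List[int]:
    """Return every 0-based start offset of query in sequence (overlaps included)."""
    positions = []
    start = sequence.find(query)
    while start != -1:
        positions.append(start)
        start = sequence.find(query, start + 1)
    return positions


if __name__ == "__main__":
    seq = "MSAQLRKPTARVCESCGRAEHWDDDLEAWQIARTDGTKQVGSPHCLHEWDINGNFNPVAMDD"
    print(find_positions(seq, "RVCES"))  # [10], i.e. the 11th character
```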
I am sorry, but I can't get what you mean.
I tried the HTMLStripCharFilter and the PatternReplaceCharFilter. They don't
work.
Could you give me an example? Thanks!
<fieldType name="text_html" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<charFilter
I use XML to index the data. One field might contain some characters
like '' =
It seems that this produces the error.
I changed that field so it is not indexed, but it doesn't work. I need to store
the field, but it does not need to be indexed.
Thanks!
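If the error comes from XML metacharacters such as `<`, `>` or `&` in the field value (an assumption; the exact characters are not visible above), escaping them before the file is posted is one possible fix. A minimal sketch using Python's standard library; `solr_field` is a hypothetical helper and the sample value is made up.

```python
from xml.sax.saxutils import escape, quoteattr


def solr_field(name: str, value: str) -> str:
    """Build one <field> element with the name and value safely escaped."""
    return "<field name=%s>%s</field>" % (quoteattr(name), escape(value))


if __name__ == "__main__":
    print(solr_field("sequence", "A<B & C>D"))
    # <field name="sequence">A&lt;B &amp; C&gt;D</field>
```

The escaping happens before the XML parser sees the document, so it applies whether or not the field is indexed.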
Thanks!
Does the schema.xml support this parameter? I am using the example post.jar
to index my file.
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-avoid-the-unexpected-character-error-tp3824726p3825959.html
Sent from the Solr - User mailing list archive at
Thank you! Now I use awk to preprocess it. It seems quite efficient. I
think other scripting languages would also be helpful.
Returning to the post, I would like to know whether Lucene supports
substring search or not.
As you can see, one field of my document is a long string field
Hello, I have a great challenge here. I have a big file (1.2 GB) with more than
200 million records that need to be indexed. Later it might grow to a file of
more than 9 GB with more than 1000 million records.
One record contains 3 fields. I am quite new to Solr and Lucene, so I
have some questions:
1. It seems that Solr