Limit indexed documents.

2015-06-19 Thread tomas.kalas
Hello, I have a few questions about indexing data.
Are there any hardware or software limits on indexing data?
And is there a maximum number of indexed documents?
Thanks for your answers.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Limit-indexed-documents-tp4212913.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tokenizer or Filter ?

2015-01-14 Thread tomas.kalas
I just used the Solr UI Analyzer for my test, or must I index it first?

I used this XML code in my schema: 

<fieldType name="direction1" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
        pattern="&lt;d1&gt;.*&lt;/d1&gt;" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

This is my result:
http://lucene.472066.n3.nabble.com/file/n4179496/dir1.png 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4179496.html


Re: Tokenizer or Filter ?

2015-01-14 Thread tomas.kalas
Jack, thanks for the help, but if I use PatternReplaceCharFilterFactory on,
for example, this:
<d1>text d1</d1><d2>text d2</d2><d1>text d1</d1><d2>text 2 ok</d2>
then the output contains only the segment <d2>text 2 ok</d2>. The segment
<d2>text d2</d2> lies between the marks <d1>...</d1><d2>...</d2><d1>...</d1>,
so the filter probably matches from the first <d1> to the last </d1>. It does
not skip whatever is in between, and replaces that too (with a space, when I
set the replacement to a space). So wouldn't it be better to use an update
processor? If you describe it well in your book, I will buy it.
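
The behaviour described above (only the last d2 segment surviving) is what a greedy ".*" produces. Here is a minimal sketch in plain Java, assuming the char filter's pattern follows java.util.regex semantics; the class and method names are made up for illustration and are not part of Solr:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; nothing here is Solr API.
final class GreedyVsReluctant {

    // Greedy: ".*" runs up to the LAST </d1>, swallowing any <d2> blocks in between.
    static final Pattern GREEDY = Pattern.compile("<d1>.*</d1>");

    // Reluctant: ".*?" stops at the FIRST </d1> that follows each <d1>.
    static final Pattern RELUCTANT = Pattern.compile("<d1>.*?</d1>");

    // Removes every match of the pattern from the input.
    static String strip(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input =
                "<d1>text d1</d1><d2>text d2</d2><d1>text d1</d1><d2>text 2 ok</d2>";
        System.out.println(strip(GREEDY, input));    // <d2>text 2 ok</d2>
        System.out.println(strip(RELUCTANT, input)); // <d2>text d2</d2><d2>text 2 ok</d2>
    }
}
```

If the same holds inside the char filter, changing the pattern from ".*" to ".*?" should keep both d2 segments, but that is worth checking in the Analyzer screen.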



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4179477.html


Re: Tokenizer or Filter ?

2015-01-14 Thread tomas.kalas
Oh yes, that is it. Thank you very much for your patience. One last
question: which type of regex does Solr actually use, POSIX or PCRE?
Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4179505.html


Re: Tokenizer or Filter ?

2015-01-13 Thread tomas.kalas
Thanks, Jack, for your advice. Could you please explain a little more how it
works? From the Apache wiki it is not too clear to me. Can I write some
JavaScript code when I want to filter some data? In this case I have
<d1>bla bla bla</d1> <d2>bla bla bla</d2> <d1>bla bla bla</d1> and I want to
filter out <d2>bla bla bla</d2>. But in another case I want to filter out all
<d1>...</d1>. I suppose I would then apply it to the indexed data and filter
from there? Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4179173.html


Re: Tokenizer or Filter ?

2015-01-09 Thread tomas.kalas
I used the same regex and unfortunately it doesn't work. Or should I
change the regex somehow? Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4178389.html


Tokenizer or Filter ?

2015-01-09 Thread tomas.kalas
Hello, I have a question: should I use a tokenizer or a filter?
I need to separate 2 channels. I wrote about this here earlier, but it is
probably not possible with Solr's basic tools, so I am trying to write my own
tool for this task.
I have this input: <d1>Hello</d1><d2>Hello</d2><d1>How are you ?</d1><d2>Fine
and you're?</d2>
d1 - direction1
d2 - direction2
and I want to output only d1 and then search for some words within that
result. For example, the output should be:
Output: [<d1>Hello</d1>,<d1>How are you?</d1><d1></d1>]

I wrote my idea in Java, but I don't know where to incorporate it: in a
Filter or in a Tokenizer? Any advice on how to start? I probably have to
extend some Lucene class and plug my slightly modified code in there, don't I?

Here is my code:

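package test1;

import java.util.Arrays;

public class Test1 {

    public static void main(String[] args) {
        String dialogue = "<d1>Hello</d1><d2>Hello</d2><d1>How are you ?</d1>"
                + "<d2>Fine and you're?</d2>";

        // Split at each boundary between a closing tag and the next opening
        // tag; the digit part matches the empty string there, so no text is lost.
        String[] input = dialogue.split("(?<=</d[12]>)\\d*(?=<d[12]>)");

        // First pass: count the d1 segments to size the result array.
        int countD1 = 0;
        for (String segment : input) {
            if (segment.startsWith("<d1>")) {
                countD1++;
            }
        }

        // Second pass: copy the d1 segments in their original order.
        String[] d1 = new String[countD1];
        int index = 0;
        for (String segment : input) {
            if (segment.startsWith("<d1>")) {
                d1[index] = segment;
                index++;
            }
        }

        String d1Out = Arrays.toString(d1);
        System.out.println(d1Out);
        // Return d1Out
    }
}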

Thanks for your advice.
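
The same extraction can also be written by matching the wanted segments directly instead of splitting and filtering. This is only a sketch in plain Java, outside Solr; the class name ChannelExtractor and the method segments are hypothetical, and it assumes the tags are never nested:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not a Lucene Tokenizer or TokenFilter.
final class ChannelExtractor {

    // Returns every <dN>...</dN> segment of the given channel, in input order.
    static List<String> segments(String dialogue, int channel) {
        String tag = "d" + channel;
        // Reluctant ".*?" so each match ends at the nearest closing tag.
        Pattern pattern = Pattern.compile("<" + tag + ">.*?</" + tag + ">", Pattern.DOTALL);
        Matcher matcher = pattern.matcher(dialogue);
        List<String> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String dialogue = "<d1>Hello</d1><d2>Hello</d2><d1>How are you ?</d1>"
                + "<d2>Fine and you're?</d2>";
        System.out.println(segments(dialogue, 1));
        // [<d1>Hello</d1>, <d1>How are you ?</d1>]
    }
}
```

Passing 2 instead of 1 returns the second speaker's segments, so one helper covers both directions.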



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346.html


Differentiate direction.

2014-12-18 Thread tomas.kalas
Hello,
is it possible to differentiate direction within one field?
I have an interview with tags like <d1>Talking first person</d1>
<d2>Talking second person</d2><d1>First person</d1><d2>Second person</d2>
etc.

I want to search only the replies from the first person.

Must I split it into several fields, or should I use some delimiter such as
<d1>...</d1>, or is there another solution?

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Differentiate-direction-tp4174963.html


Re: Design optimal Solr Schema

2014-12-11 Thread tomas.kalas
Thanks for the help, but as Alex wrote, I used the synonym filter and it does
what I want. When I add, for example, Hello, Hi to the synonyms, the sentence
is "Hello how are you" and my query is "Hi how are you", it finds it too.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Design-optimal-Solr-Schema-tp4166632p4173690.html


Alternative synonymum

2014-12-11 Thread tomas.kalas
Hello, I want to search in transcripts of phone conversations. The machine
that transcribes the conversation to text produces some alternatives. For
example, take the sentence
Hello how are you.

1. Segment
Hello
Halo
Hollow

2. Segment
How
Bow


When I want to search for, say, "Halo how are you", I use the synonym filter
for this example.

For Hello I set the alternatives Halo, Hollow ...

It works, but if the same word occurs in a later segment with different
alternatives, for example How, Know, and I add that to the synonym filter on
a new line too, then the word How now has all the alternatives How, Know,
Bow. If I search for "Hello Know", it also finds the sentence whose
alternatives for How include only Bow, not Know.

In this case it finds the example sentence "Hello how are you". In the first
sentence the word how has the alternative bow, but the value know is stored
as well, from the alternatives of the other occurrence.

Is it possible to handle this case, for example by segment: when I know that
segment 1 contains specific words, use only those as synonyms, and do the
same at later positions, but with another segment number?

Thanks, I hope you understand what I mean.
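
To make the difference concrete, here is a small sketch of per-segment alternatives in plain Java. It is not Solr or Lucene API; the class SegmentAlternatives and its methods are hypothetical. Each segment keeps its own set of alternatives, so an alternative recorded for "How" elsewhere cannot leak into this sentence:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of one transcript; not a Solr component.
final class SegmentAlternatives {

    // One set of alternative words per segment, in transcript order.
    private final List<Set<String>> segments = new ArrayList<>();

    void addSegment(String... alternatives) {
        Set<String> set = new HashSet<>();
        for (String alternative : alternatives) {
            set.add(alternative.toLowerCase(Locale.ROOT));
        }
        segments.add(set);
    }

    // True if the query words match consecutive segments, each word being
    // one of the alternatives recorded for that particular segment.
    boolean matchesPhrase(String query) {
        String[] words = query.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int start = 0; start + words.length <= segments.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < words.length && ok; i++) {
                ok = segments.get(start + i).contains(words[i]);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SegmentAlternatives sentence = new SegmentAlternatives();
        sentence.addSegment("Hello", "Halo", "Hollow");
        sentence.addSegment("How", "Bow");
        sentence.addSegment("are");
        sentence.addSegment("you");
        System.out.println(sentence.matchesPhrase("Halo how are you")); // true
        System.out.println(sentence.matchesPhrase("Hello know"));       // false
    }
}
```

A global synonym list has no notion of "this segment", which is why it merges How, Know and Bow. The sketch suggests that the alternatives need to be attached to the position in the transcript rather than to the word.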




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Alternative-synonymum-tp4173694.html


Re: Design optimal Solr Schema

2014-12-11 Thread tomas.kalas
Oh no, I wanted to reply in this topic, where you helped me with the synonym
filter:

http://lucene.472066.n3.nabble.com/Alternative-searching-td4172339.html

but I had this topic open too; I was checking my answer in Google Translate
and copied it here.

My task has now changed: I no longer have to search up to a specific time,
only within a phrase, but with alternatives. The synonym filter is a good
idea, but when a particular word has different alternatives in different
cases, that is the problem I am dealing with now. I asked about it in this
topic:
http://lucene.472066.n3.nabble.com/Alternative-synonymum-td4173694.html

Sorry for the chaos.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Design-optimal-Solr-Schema-tp4166632p4173748.html


Alternative searching

2014-12-03 Thread tomas.kalas
Hello,
is it possible in Solr to search for alternative words from a field?
For example, if I want to search for a phrase from these ranges:

At the first position I would probably have, in one field, hello, hi, cheerio.
At the second: my
At the third: name
At the fourth: is
At the fifth: Tomas, John, Paul.

And if I send the query "My Tomas"~3 / "My John"~3 / "Hi Paul"~4,
it should always find the required phrase.

Thanks.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Alternative-searching-tp4172339.html


Re: Alternative searching

2014-12-03 Thread tomas.kalas
OK, and how do you think I should get the data into the fields? And how does
it recognize that it is one term?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Alternative-searching-tp4172339p4172349.html


Re: Alternative searching

2014-12-03 Thread tomas.kalas
It's OK; when I use the example with the synonym filter, it works, but I
don't know how to transfer this text into the schema.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Alternative-searching-tp4172339p4172356.html


Re: Alternative searching

2014-12-03 Thread tomas.kalas
I think something like this:

First Position  Second Position  Third Position  Fourth Position  Fifth Position
--------------  ---------------  --------------  ---------------  --------------
Hello           My               Name            Is               Paul
Hi                                                                Tomas
Cheerio                                                           John

It is like one sentence, and I think it should not be spread over several docs.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Alternative-searching-tp4172339p4172361.html


Re: Design optimal Solr Schema

2014-10-31 Thread tomas.kalas
Thanks for your help.
OK, I will try to explain it once more; sorry for my English.
I need some functions in my search.

1.) I will naturally have a lot of documents, and I want to find out whether
a phrase occurs with its words up to, for example, 5 words apart. I used
w:"Good morning"~5 (it works in the Solr example, but I don't know how to do
it in my project).

2.) Find a word (phrase) up to a certain time, for example "Good morning" up
to time 5.25.

3.) And, if possible, the order of the words. I am using the Solarium client
for highlighting, and I want to highlight the words in this order, for
example "Hello How Are you". If the field contains the words *hello* you are
*how are you* and the searched words are not in order, then skip them. This
is not necessary, though; my main problem is with the first 2 points.

How do I make an ideal schema and parse the data from the source file?

I have done a demo with basic searching. On one page I have a form, and the
results are links to files by id (I use the filename as the id). When I click
a link, I pass the query as a parameter, and on the result page I get the
data needed to display the result.

The result file is a table with the whole transcribed interview, with the
results highlighted.

Thanks for the help.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Design-optimal-Solr-Schema-tp4166632p4166793.html


Re: Design optimal Solr Schema

2014-10-31 Thread tomas.kalas
Oh yes, I want to display the stored data in an HTML file. I have 2 pages: on
one page there is a form, and I show the results there.
A result is a link (by ID) to the file with the whole conversation on the
second page. And what did you mean by separating each conversation
interaction? Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Design-optimal-Solr-Schema-tp4166632p4166805.html


Design optimal Solr Schema

2014-10-30 Thread tomas.kalas
Hello, I have a problem with the design of a schema in Solr. I have a
transcript of a telephone conversation in this format. I parse it into
individual fields. I have this schema:

<?xml version="1.0"?>
<add>
<doc>
<field name="id">01.cn</field>
<field name="t">0<br /> 1<br /> 2<br /> 2 <br /> 3 <br /> </field>
<field name="st">0.00<br /> 1.54<br /> 1.54<br /> 1.54 <br /> 1.57 <br />
</field>
<field name="et">1.54<br /> 1.54<br /> 1.57<br /> 1.57 <br /> 1.7 <br />
</field>
<field name="w">_SILENCE_<br /> s<br /> HELLO<br /> HALLO <br /> _DELETE_
<br /> </field>
<field name="p">0.00<br /> 1<br /> 1<br /> 2.06115e-009 <br /> 1 <br />
</field>
<field name="c">0<br /> 0<br /> 0<br /> 0 <br /> 0 <br /> </field>
</doc>
</add>

I display it in an HTML document, which is why I used <br />.

This is the original document:

T=0 ST=0.00 ET=1.54 W=_SILENCE_ P=0.00 C=0
T=1 ST=1.54 ET=1.54 W=s P=1 C=0
T=2 ST=1.54 ET=1.57 W=HELLO P=1 C=0
T=2 ST=1.54 ET=1.57 W=HALLO P=2.06115e-009 C=0
T=3 ST=1.57 ET=1.70 W=_DELETE_ P=1 C=0
T=3 ST=1.57 ET=1.70 W=NO P=2.06115e-009 C=0
T=4 ST=1.70 ET=2.12 W=HOW P=1 C=0
T=5 ST=2.12 ET=2.18 W=ARE_ P=0.25 C=0
T=5 ST=2.12 ET=2.18 W=_DELETE_ P=0.25 C=0
..
..

Id - filename
T = Segment
ST = Start time
ET = End time
W = Word
P = Probability
C = Channel

I want to search, for example, for a word that occurs up to time 1.57:
(w:HeLLO) AND (t:[0 TO 1.57]). But since all the values are in one field each
(t, st, et ...), it doesn't work. It also finds all files where hello occurs
later than 1.57.

Do you have any ideas how to do it? Thanks a lot for your help.
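
One way to look at the problem: once all the words and all the times of a file are packed into single fields, the word and its time are no longer tied together. The sketch below parses each line of the original document into its own record and filters on word and start time together. It is plain Java with hypothetical names (TranscriptSearch, Token, parse, find), not Solr API, and it assumes every line consists of well-formed KEY=VALUE pairs. Treating each row as its own record (for example, one Solr document per row) is my assumption, not something confirmed in this thread:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for lines such as "T=2 ST=1.54 ET=1.57 W=HELLO P=1 C=0".
final class TranscriptSearch {

    // One row of the transcript: segment, start/end time, word, probability, channel.
    static final class Token {
        final int segment;
        final double start;
        final double end;
        final String word;
        final double probability;
        final int channel;

        Token(int segment, double start, double end,
              String word, double probability, int channel) {
            this.segment = segment;
            this.start = start;
            this.end = end;
            this.word = word;
            this.probability = probability;
            this.channel = channel;
        }
    }

    // Splits the line on whitespace and reads each KEY=VALUE pair.
    static Token parse(String line) {
        Map<String, String> values = new HashMap<>();
        for (String pair : line.trim().split("\\s+")) {
            int eq = pair.indexOf('=');
            values.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return new Token(
                Integer.parseInt(values.get("T")),
                Double.parseDouble(values.get("ST")),
                Double.parseDouble(values.get("ET")),
                values.get("W"),
                Double.parseDouble(values.get("P")),
                Integer.parseInt(values.get("C")));
    }

    // Keeps the tokens whose word matches (ignoring case) and that start
    // no later than maxStart.
    static List<Token> find(List<Token> tokens, String word, double maxStart) {
        List<Token> hits = new ArrayList<>();
        for (Token token : tokens) {
            if (token.word.equalsIgnoreCase(word) && token.start <= maxStart) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] lines = {
            "T=0 ST=0.00 ET=1.54 W=_SILENCE_ P=0.00 C=0",
            "T=2 ST=1.54 ET=1.57 W=HELLO P=1 C=0",
            "T=2 ST=1.54 ET=1.57 W=HALLO P=2.06115e-009 C=0",
            "T=4 ST=1.70 ET=2.12 W=HOW P=1 C=0"
        };
        List<Token> tokens = new ArrayList<>();
        for (String line : lines) {
            tokens.add(parse(line));
        }
        System.out.println(find(tokens, "hello", 1.57).size()); // 1
        System.out.println(find(tokens, "how", 1.57).size());   // 0
    }
}
```

Because word and time live in the same record here, the condition "HELLO no later than 1.57" is checked per row and cannot be satisfied by a word from one row and a time from another.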



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Design-optimal-Solr-Schema-tp4166632.html