RE: The book: Solr 4.x Deep Dive - Early Access Release #1

2013-06-21 Thread Swati Swoboda
I'd be willing to pay $30-$40 initial cost, but would expect to be able to get 
the revisions for no cost then.

With the revision model, I'd pay the initial $10 and then $3-$9 per revision 
(depending on what's in the revision). It's an interesting model then, because 
I can choose to not purchase sections/revisions that I am not interested in.

Thanks for doing this, Jack. Very excited!

Swati

-Original Message-
From: Stevo Slavić [mailto:ssla...@gmail.com] 
Sent: Friday, June 21, 2013 9:39 AM
To: solr-user
Subject: Re: The book: Solr 4.x Deep Dive - Early Access Release #1

Consider https://leanpub.com/ for publishing.

I'm in no way affiliated with them; I just have positive personal buying
experience. One can give more than the minimum price the author asks, and I
regularly do. For a work of 1k pages, I would definitely pay more than $10.

Kind regards,
Stevo Slavic.


On Fri, Jun 21, 2013 at 3:32 PM, AJ Weber awe...@comcast.net wrote:



 On 6/21/2013 9:22 AM, Alexandre Rafalovitch wrote:


 I might, however, be confused about your strategy. I thought you
 were going to do several different volumes rather than one large one.
 Or is this all a 'first volume' discussion so far?

 Pricing: $7.99 feels better for a book this size. Under $5 it feels
 like it may be mostly filler (even if it is not). I don't think
 anybody will pay every month just because it got updated.

 I agree that I'm a little confused as to the pricing.  Are you saying
 you'll keep updating it and everyone would just d/l the latest version
 monthly?  If so, what's to stop someone from waiting to subscribe until
 it is entirely complete and just pay the $8 once for the whole thing --
 versus those of us (me included) who would be sending our $8 every month
 and therefore receiving the same work at 10x the price (for example)?

 I'm with one of the previous responses:  I'd be willing to pay $30 for
 early-access (and updates) to an eBook as a one-time-cost and then when you
 release the final, set it at $40 or more.




RE: Issues in the Fuzzy Query !

2013-06-21 Thread Swati Swoboda
Hello,

Can you share the exact params you are passing to Solr?

Thanks

From: vibhoreng04 [vibhoren...@gmail.com]
Sent: June 21, 2013 9:27 AM
To: solr-user@lucene.apache.org
Subject: Issues in the Fuzzy Query !

Hi All,

I have been facing problems with fuzzy queries. For example, if I query
((FIELDNAME1:FRANK~0.80) AND (FIELDNAME1:INDIANO~0.80))^0.80, the parsed
query changes my distance grade to ~0. In other cases the distance in the
parsed query is changed to ~1 or ~2. Can anyone tell me what the issue is
here? If there is an issue in the way I am querying, I would love to hear
it.


<str name="rawquerystring">((FIELDNAME1:FRANK~0.80) AND (FIELDNAME1:INDIANO~0.80))^0.80</str>
  <str name="querystring">((FIELDNAME1:FRANK~0.80) AND (FIELDNAME1:INDIANO~0.80))^0.80</str>
  <str name="parsedquery">(+((+FIELDNAME1:FRANK~0 +FIELDNAME1:INDIANO~2)^0.8))/no_coord</str>
  <str name="parsedquery_toString">+((+FIELDNAME1:FRANK~0 +FIELDNAME1:INDIANO~2)^0.8)</str>
  <lst name="explain"/>
  <str name="QParser">ExtendedDismaxQParser</str>
  <null name="altquerystring"/>





RE: Steps for creating a custom query parser and search component

2013-06-20 Thread Swati Swoboda
Hi Juha,

If it's just a matter of format, have you considered adding another layer in
front of Solr: a class that takes in your queries in the proprietary format and
converts them to what Solr needs? Similarly, if you need your results in a
particular format, just convert them back on the way out. I would imagine that'd
be a lot simpler than subclassing Solr classes.

Swati

-Original Message-
From: Juha Haaga [mailto:juha.ha...@codenomicon.com] 
Sent: Thursday, June 20, 2013 9:33 AM
To: solr-user@lucene.apache.org
Subject: Steps for creating a custom query parser and search component

Hello list followers,

I need to write a custom Solr query parser and a search component. The 
requirements for the component are that the raw query that may need to be split 
into separate Solr queries is in a proprietary format encoded in JSON, and the 
output is also going to be in a similar proprietary JSON format. I would like 
some advice on how to get started.

Which base classes should I start to work with? I have been looking at the
plugin classes, and my initial thoughts are along the lines of the following
workflow:

1. Subclass (QParser?) and write a new parser method that knows how to deal 
with the input format.
2. Subclass (SolrQueryRequestBase?) or use LocalSolrQueryRequest like in the 
TestHarness.makeRequest() and use it to execute the required queries.
3. Compile the aggregate results as specified in the query. 
4. Use some existing component (?) for returning the results to the user.
5. Put these components in steps 1-4 together into (?) so that it can be added 
to solrconfig.xml as a custom query parser accessible at 
http://solr/core/customparser

Is my approach reasonable, or am I overlooking some canonical way of achieving
what I need to do? What and where do I need to look to replace the question
marks in my plan with knowledge? :)

-- Juha



RE: Question about SOLR search relevance score

2013-06-19 Thread Swati Swoboda
Hi Sergio,

Append 'debugQuery=on' to your queries to learn more about how your queries
are being evaluated and ranked, i.e.:

qf=attributes_name^15+attributes_brand^10+attributes_category^8&debugQuery=on

You'll get an XML section that is dedicated to debug information.

I've found http://explain.solr.pl/ useful in understanding and visualizing the 
debug output.
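
For example, a full debug request might look like this (a sketch: the handler path and field names mirror the ones in your message and may differ in your setup):

http://localhost:8983/solr/select?defType=edismax&q=apples&qf=attributes_name^15+attributes_brand^10+attributes_category^8&debugQuery=on

The debug section of the response then contains an explain entry per document showing how each boosted field contributed to the score.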

Swati

-Original Message-
From: sérgio Alves [mailto:sd_t_al...@hotmail.com] 
Sent: Wednesday, June 19, 2013 11:45 AM
To: solr-user@lucene.apache.org
Subject: Question about SOLR search relevance score

Hi.

My name is Sérgio Alves and I'm a developer on a project that uses Solr as its
search engine.

Right now we're having problems with some common search terms. They return
varied results, and the products which should appear first are scored lower
than other, seemingly unrelated, products.

I wanted to know if there is a parameter or any possible way for me to see how
Solr calculates the scores it returns. For example, if we had a search
relevancy formula like
qf=attributes_name^15+attributes_brand^10+attributes_category^8, how can I
know that brand scored 'x', name 'y' and category 'z'? Is that possible? How
can I do that?

This is urgent; if someone could take the time to answer this in a quick
manner, I would really appreciate it.

Thank you very much for the attention. Best regards,

Sérgio Alves


RE: Note on The Book

2013-05-28 Thread Swati Swoboda
I'd definitely prefer the spiral bound as well. E-books are great and your 
draft version seems very reasonably priced (aka I would definitely get it). 

Really looking forward to this. Is there a separate mailing list, etc. for the
book, for those who would like to receive updates on its status?

Thanks 

Swati Swoboda 
Software Developer - Igloo Software
+1.519.489.4120  sswob...@igloosoftware.com

Bring back Cake Fridays – watch a video you’ll actually like
http://vimeo.com/64886237


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Thursday, May 23, 2013 7:15 PM
To: solr-user@lucene.apache.org
Subject: Note on The Book

To those of you who may have heard about the Lucene/Solr book that I and two 
others are writing on Lucene and Solr, some bad and good news. The bad news: 
The book contract with O’Reilly has been canceled. The good news: I’m going to 
proceed with self-publishing (possibly on Lulu or even Amazon) a somewhat 
reduced scope Solr-only Reference Guide (with hints of Lucene). The scope of 
the previous effort was too great, even for O’Reilly – a book larger than 800 
pages (or even 600) that was heavy on reference and lighter on “guide” just 
wasn’t fitting in with their traditional “guide” model. In truth, Solr is just 
too complex for a simple guide that covers it all, let alone Lucene as well.

I’ll announce more details in the coming weeks, but I expect to publish an 
e-book-only version of the book, focused on Solr reference (and plenty of guide 
as well), possibly on Lulu, plus eventually publish 4-8 individual print 
volumes for people who really want the paper. One model I may pursue is to 
offer the current, incomplete, raw, rough, draft as a $7.99 e-book, with the 
promise of updates every two weeks or a month as new and revised content and 
new releases of Solr become available. Maybe the individual e-book volumes 
would be $2 or $3. These are just preliminary ideas. Feel free to let me know 
what seems reasonable or excessive.

For paper: Do people really want perfect bound, or would you prefer spiral 
bound that lies flat and folds back easily? I suppose we could offer both – 
which should be considered “premium”?

I’ll announce more details next week. The immediate goal will be to get the 
“raw rough draft” available to everyone ASAP.

For those of you who have been early reviewers – your effort will not have been 
in vain. I have all your comments and will address them over the next month or 
two or three.

Just for some clarity, the existing Solr Wiki and even the recent contribution 
of the LucidWorks Solr Reference to Apache really are still great contributions 
to general knowledge about Solr, but the book is intended to go much deeper 
into detail, especially with loads of examples and a lot more narrative guide. 
For example, the book has a complete list of the analyzer filters, each with a 
clean one-liner description. Ditto for every parameter (although I would note 
that the LucidWorks Solr Reference does a decent job of that as well.) Maybe, 
eventually, everything in the book COULD (and will) be integrated into the 
standard Solr doc, but until then, a single, integrated reference really is 
sorely needed. And, the book has a lot of narrative guide and walking through 
examples as well. Over time, I’m sure both will evolve. And just to be clear, 
the book is not a simple repurposing of the Solr wiki content – EVERY 
description of everything has been written fresh, from scratch. So, for 
example, analyzer filters get both short one-liner summary descriptions as well 
as more detailed descriptions, plus formal attribute specifications and 
numerous examples, including sample input and outputs (the LucidWorks Solr 
Reference does a better job with examples as well.)

The book has been written in parallel with branch_4x and that will continue.

-- Jack Krupansky


RE: Score after boost & before

2013-04-05 Thread Swati Swoboda
http://explain.solr.pl/ might help you out with parsing out the response to see 
how boosts are affecting the scores. Take a look at some of the 
history/examples:

http://explain.solr.pl/explains/7kjl0ids


-Original Message-
From: abhayd [mailto:ajdabhol...@hotmail.com] 
Sent: Friday, April 05, 2013 12:37 PM
To: solr-user@lucene.apache.org
Subject: Re: Score after boost & before

We do that now, but that's very time consuming.

Also, we want our QA to have that info available on the search results page in
debug mode.







RE: How does solr 4.2 do in returning large datasets ?

2013-04-01 Thread Swati Swoboda
It really depends on what you are returning (how big is each document? Just a
document ID? Pages and pages of data in fields?).

It can take a long time for Solr to render an XML response with 60,000 results.
Solr will be serializing the data and then you'd (presumably) be de-serializing
it. Depending on how big each field actually is, this could take a while or
even cause a DoS on your server.

Your client would also need a fair bit of memory to parse a document with
60,000 results.


-Original Message-
From: Liz Sommers [mailto:lizswo...@gmail.com] 
Sent: Monday, April 01, 2013 9:39 AM
To: solr-user
Subject: How does solr 4.2 do in returning large datasets ?

I thought I remembered reading that Solr is not good for returning large
datasets. We are currently using Lucene 3.6.0 and returning datasets of
10,000 to 60,000 results. In the future we might need to return even larger
datasets.

Would you all recommend going to Solr for this, or should we stick with Lucene 
(which has given us no problems in this regard)?  I am a bit wary of using a 
web service to return datasets of this size.

Thanks a lot
Liz
lizswo...@gmail.com


RE: Is deltaQuery mandatory ?

2013-03-28 Thread Swati Swoboda
No, it's not mandatory. You can't do delta imports without delta queries,
though; you'd need to do a full-import. Per your query, you'd only ever index
objects with rownum <= 5000.
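
For reference, a minimal delta-capable entity looks roughly like this (a sketch; the table and column names are hypothetical):

<entity name="item"
        query="select * from item"
        deltaQuery="select ID from item where last_modified > '${dataimporter.last_index_time}'"
        deltaImportQuery="select * from item where ID = '${dataimporter.delta.ID}'">
    <field column="ID" name="id" />
</entity>

The deltaQuery collects only the primary keys of changed rows; the deltaImportQuery then fetches each of those rows by key.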

-Original Message-
From: A. Lotfi [mailto:majidna...@yahoo.com] 
Sent: Thursday, March 28, 2013 10:07 AM
To: gene...@lucene.apache.org; solr-user@lucene.apache.org
Subject: Is deltaQuery mandatory ?

Is deltaQuery mandatory in data-config.xml ?

I did it like this:

<entity name="residential"
        query="select * from tsunami.consumer_data_01 where state='MA' and rownum <= 5000"
        deltaQuery="select LEMSMATCHCODE, STREETNAME from residential where last_modified > '${dataimporter.last_index_time}'" />
Then my manager came and said we don't need it, as this is only for
incremental updates. I took off the line that starts with deltaQuery, and now
in:

http://localhost:8983/solr/#/db/dataimport//dataimport

the entity is empty. When I click the Execute button, nothing happens.

thanks.


RE: SOLR - Unable to execute query error - DIH

2013-03-28 Thread Swati Swoboda
What version of Solr4 are you running? We are on 3.6.2 so I can't be confident 
whether these settings still exist (they probably do...), but here is what we 
do to speed up full-indexing:

In solrconfig.xml, increase your ramBufferSize to 128MB.
Increase mergeFactor to 20.
Make sure autoCommit is disabled.

Basically, you want to minimize how often Lucene/Solr flushes (as that is very 
time consuming). Merging is also very time consuming, so you want large 
segments and fewer merges (hence the merge factor increase). We use these 
settings when we are doing our initial full indexing and then switch them back 
to saner defaults for our regular/delta indexing.
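
As a rough sketch, the relevant pieces of a 3.x solrconfig.xml look something like this (element names and placement vary between Solr versions, so treat this as an outline rather than exact config):

<indexDefaults>
  <!-- buffer more documents in RAM before flushing a segment -->
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <!-- let more segments accumulate before a merge kicks in -->
  <mergeFactor>20</mergeFactor>
</indexDefaults>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- autoCommit deliberately omitted/commented out so nothing flushes mid-import -->
</updateHandler>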

Roll-backs concern me; why did your query roll back? Did it give an error -- it
should have. It should be in your Solr log file. Was it because the connection
timed out? It's important to find out. We prevented roll-backs by effectively
splitting our data across entities and then indexing one entity at a time. This
allowed us to make sure that if one sector failed, it didn't impact the
entire process. (This can be done by using autoCommit, but that slows down
indexing.)

If you're getting OOM errors, be sure that your Xmx value is set high enough 
(and that you have enough memory). You may be able to increase ramBufferSize 
depending on how much memory you had (we didn't have much). 

Hope this helps.
Swati


-Original Message-
From: kobe.free.wo...@gmail.com [mailto:kobe.free.wo...@gmail.com] 
Sent: Thursday, March 28, 2013 2:43 AM
To: solr-user@lucene.apache.org
Subject: RE: SOLR - Unable to execute query error - DIH

Thanks James.

We have tried the following options *(individually)* including the one you 
suggested,

1. selectMethod=cursor
2. batchSize=-1
3. responseBuffering=adaptive

But the indexing process doesn't seem to be improving at all. When we try to
index a set of 500 rows, it works well and completes in 18 min. For 1000K rows
it took 22 hours (long) to index. But when we try to index the complete set of
750K rows, it doesn't show any progress and keeps on executing.

Currently both the SQL Server machine and the SOLR machine are running on 4 GB
RAM. With this configuration, does the above scenario stand justified? If we
think of upgrading the RAM, which machine should it be: the SOLR machine or the
SQL Server machine?

Are there any other efficient methods to import/ index data from SQL Server to 
SOLR?

Thanks!





Contributors Group

2013-03-25 Thread Swati Swoboda
Hello,

Can I be added to the contributors group? Username sswoboda.

Thank you.

Swati


RE: Get page number of searchresult of a pdf in solr

2013-02-28 Thread Swati Swoboda
You can get the paragraph of the search result via highlights. You'd have to 
mark your field as stored (re-indexing required) and then specify it in the 
highlighting parameters. 

http://wiki.apache.org/solr/HighlightingParameters#hl

As for getting the page number, I am not sure if there is more you can do than 
what Michael suggested...
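
As a sketch, a highlighted query over a stored body field (the field name here is hypothetical) would look like:

http://localhost:8983/solr/select?q=searchterm&hl=on&hl.fl=body&hl.snippets=3&hl.fragsize=200

Each matching document then gets a highlighting section in the response with up to three snippets built around the matched terms.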



-Original Message-
From: d...@geschan.de [mailto:d...@geschan.de] 
Sent: Thursday, February 28, 2013 3:27 PM
To: solr-user@lucene.apache.org
Subject: Get page number of searchresult of a pdf in solr

Hello,

I'm building a web application where users can search for PDF documents and
view them with pdf.js. I would like to display the search results with a short
snippet of the paragraph where the search term was found and a link to open the
document at the right page.

So what I need is the page number and a short text snippet of every search 
result.

I'm using SOLR 4.1 for indexing PDF documents. The indexing itself works fine,
but I don't know how to get the page number and paragraph of a search result; I
only get the document that the search term was found in.

-Gesh



RE: POI error while extracting docx document

2013-02-26 Thread Swati Swoboda
Hey Carlos,

What version of Solr are you running and what version of openxml4j did you 
import?

Swati

-Original Message-
From: Carlos Alexandro Becker [mailto:caarl...@gmail.com] 
Sent: Tuesday, February 26, 2013 12:04 PM
To: solr-user
Subject: Re: POI error while extracting docx document

I've added the openxml4j jar to the project; it still doesn't work. What is the
correct version?


On Tue, Feb 26, 2013 at 11:23 AM, Carlos Alexandro Becker  caarl...@gmail.com 
wrote:

 I made Solr extract the files' content. That's OK, but some files (like
 .docx files) give me errors, while .pdf files index as expected.

 The error is:


 14:20:29,714 ERROR [org.apache.solr.servlet.SolrDispatchFilter]
 (http--0.0.0.0-8080-4) null:java.lang.RuntimeException: java.lang.NoSuchMethodError:
 org.apache.poi.openxml4j.opc.PackagePart.getRelatedPart(Lorg/apache/poi/openxml4j/opc/PackageRelationship;)Lorg/apache/poi/openxml4j/opc/PackagePart;
     at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:469)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:297)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
     at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155)
     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
     at org.apache.catalina.authenticator.SingleSignOn.invoke(SingleSignOn.java:397)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877)
     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:671)
     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
     at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.NoSuchMethodError:
 org.apache.poi.openxml4j.opc.PackagePart.getRelatedPart(Lorg/apache/poi/openxml4j/opc/PackageRelationship;)Lorg/apache/poi/openxml4j/opc/PackagePart;
     at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:121)
     at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:107)
     at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:112)
     at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
     at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
     ... 14 more



 Looks like a dependency issue. What dependency should I add to fix this?


 --
 Atenciosamente,
 *Carlos Alexandro Becker*
 http://caarlos0.github.com/about




--
Atenciosamente,
*Carlos Alexandro Becker*
http://caarlos0.github.com/about


RE: User Query Processing Sanity Check

2013-02-25 Thread Swati Swoboda
Maybe I am not understanding correctly, but have you overlooked the qf
parameter for Edismax?

http://wiki.apache.org/solr/ExtendedDisMax#qf_.28Query_Fields.29

Suppose you want to search for the phrase "apples and bananas" in title,
summary, and body. You also want it to have greater emphasis when the search
term is found in title and description. The way you would do it is:

q = apples and bananas
qf = title^100 content description^10

That's it. Now it'll search for "apples and bananas" in all 3 fields. Edismax
was basically designed to do... what you want to do. You'll probably also find
the mm parameter and the pf parameter immensely useful.
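
As a sketch, a full edismax request with those parameters might look like this (the handler path and values are illustrative):

http://localhost:8983/solr/select?defType=edismax&q=apples+and+bananas&qf=title^100+content+description^10&mm=2&pf=title^50

Here mm=2 requires at least two of the query terms to match, and pf additionally boosts documents where the whole phrase occurs in the title field.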



-Original Message-
From: z...@navigo.com [mailto:z...@navigo.com] 
Sent: Monday, February 25, 2013 12:06 PM
To: solr-user@lucene.apache.org
Subject: User Query Processing Sanity Check

Have been working with Solr for about 6 months, straightforward stuff, basic 
keyword searches. We want to move to more advanced stuff, to support 'must 
include', 'must not include', set union, etc. I.e., more advanced query strings.

We seem to have hit a block, and are considering two paths and want to make 
sure we have the right understanding before wasting time. To wit:

- We have many fields to search: fieldA, fieldB, fieldC, etc.
- We need field-level boosting: fieldA > fieldB > fieldC, etc.
- We're happy to use EDisMax query syntax: "", +, -, OR, AND, (), and
field:term superficial syntax.

Passing the query straight through doesn't seem to work because foo bar
fieldB:baz searches foo and bar in the default field only, but we want to
search multiple fields. The trick of copying multiple fields into a single
artificial default field seems to fail on the second requirement.

So, we end up parsing the Lucene syntax ourselves and rebuilding the query by
multiplying the fields, so that:

foo bar fieldB:baz -> (fieldA:foo OR fieldB:foo OR fieldC:foo) AND (fieldA:bar
OR fieldB:bar OR fieldC:bar) AND (fieldB:baz)

Technically, this is straightforward enough, but it seems a shame since the 
EDisMax query parser seems like it's *almost* what we want, if it weren't for 
the reality of the singular default field.

Are we correct to build our own mini-parser that takes query strings and 
multiplies the fields for free-field sub-predicates? Or is there a simpler path 
that we're overlooking?

Regards,
Zane





RE: Index data from multiple tables into Solr

2013-01-15 Thread Swati Swoboda
He is talking about this list, the list we are using to communicate. You are 
sending your messages to a mailing list -- thousands are on it.

An example of a program that will run the delta-import/full-import commands:
cron. You are basically calling a URL with specific parameters to pull data
from your DB.
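
For example, a cron entry along these lines would trigger a delta-import every 15 minutes (the paths are illustrative; the DIH handler path depends on your solrconfig.xml):

*/15 * * * * curl -s 'http://localhost:8983/solr/dataimport?command=delta-import&clean=false&commit=true'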

Examples of programs that use the Solr API are all application-specific (based
on what fields are in your schema, etc.).

Swati

-Original Message-
From: hassancrowdc [mailto:hassancrowdc...@gmail.com] 
Sent: Tuesday, January 15, 2013 2:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Index data from multiple tables into Solr

Which list are you referring to?

And can you please give an example of such a program (it doesn't matter if it
is from your setup)?


On Tue, Jan 15, 2013 at 12:06 PM, Shawn Heisey-4 [via Lucene] 
ml-node+s472066n4033518...@n3.nabble.com wrote:

 On 1/15/2013 9:20 AM, hassancrowdc wrote:
  Hi,
  once i have indexed data from multiple tables from mysql database 
  into solr, is there any way that it update data(automatically) if 
  any change
 is
  made to the data in mysql?

 You need to write a program to do this.

 Although this list can provide guidance, such programs are highly 
 customized to the particulars for your setup.  There is not really any 
 general purpose solution here.

 There are two typical approaches - have a program that initiates 
 delta-imports with the dataimporter, or write a program that both 
 talks to your database and uses a Solr client API to send updates to 
 Solr.  I used to use the former approach, now I use the latter.  I 
 still use the dataimporter for full reindexes, though.

 Thanks,
 Shawn










RE: Index data from multiple tables into Solr

2013-01-15 Thread Swati Swoboda
See the SolrJ client: https://wiki.apache.org/solr/Solrj. You'd have to
configure and use it based on your application's needs.

-Original Message-
From: hassancrowdc [mailto:hassancrowdc...@gmail.com] 
Sent: Tuesday, January 15, 2013 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Index data from multiple tables into Solr

OK. So if I have manufacturer and id fields in the schema file, what will the
program that uses the Solr API be?






RE: Index data from multiple tables into Solr

2013-01-15 Thread Swati Swoboda
What error are you getting? Which field are you searching (default field)? Did 
you try specifying a default field? What is your schema like? Which analyzers 
did you use?

Which version of solr are you using? I highly recommend going through the 
tutorial to get a basic understanding of inserting, updating, and searching:

http://lucene.apache.org/solr/tutorial.html

Hours have been spent in setting up these tutorials and they are very 
informative.

-Original Message-
From: hassancrowdc [mailto:hassancrowdc...@gmail.com] 
Sent: Tuesday, January 15, 2013 3:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Index data from multiple tables into Solr

okay, thank you.

After indexing data from database to solr. I want to search such that if i 
write any word (that is included in the documents been indexed) it should 
return all the documents that include that word. But it does not. When i
write http://localhost:8983/solr/select?q=anyword   i gives me error.

is there anything wrong with my http? or is this the wrong place to search?


On Tue, Jan 15, 2013 at 2:48 PM, sswoboda [via Lucene] 
ml-node+s472066n4033563...@n3.nabble.com wrote:

 https://wiki.apache.org/solr/Solrj client. You'd have to configure it 
 / use it based on your application needs.

 -Original Message-
 From: hassancrowdc [mailto:[hidden email]]
 Sent: Tuesday, January 15, 2013 2:38 PM
 To: [hidden email]
 Subject: Re: Index data from multiple tables into Solr

 ok.
 so if i have manufacturer and id fields in schema file, what will be 
 wat will be program that will use that will use solr API?













RE: Index data from multiple tables into Solr

2013-01-15 Thread Swati Swoboda
http://wiki.apache.org/solr/ExtendedDisMax

Specify your query fields in the qf parameter. Take a look at the example at 
the bottom of the page.
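
As a sketch (the field names are placeholders for whatever is in your schema), searching several fields at once looks like:

http://localhost:8983/solr/select?defType=edismax&q=anyword&qf=manufacturer+name+description

Alternatively, you can copyField everything into one catch-all field and make that your default search field.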



-Original Message-
From: hassancrowdc [mailto:hassancrowdc...@gmail.com] 
Sent: Tuesday, January 15, 2013 3:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Index data from multiple tables into Solr

I don't want to search by one field; I want to search as a whole. I am
following that tutorial: I got indexing and updating working, but now I would
like to search through everything I have indexed, not a specific field. I can
do it by using a default field, but I would like to search through everything I
have indexed. Any hint how I can do that?


On Tue, Jan 15, 2013 at 3:49 PM, sswoboda [via Lucene] 
ml-node+s472066n4033617...@n3.nabble.com wrote:

 What error are you getting? Which field are you searching (default field)?
 Did you try specifying a default field? What is your schema like? 
 Which analyzers did you use?

 Which version of solr are you using? I highly recommend going through 
 the tutorial to get a basic understanding of inserting, updating, and
 searching:

 http://lucene.apache.org/solr/tutorial.html

 Hours have been spent in setting up these tutorials and they are very 
 informative.

 -Original Message-
 From: hassancrowdc [mailto:[hidden email]]
 Sent: Tuesday, January 15, 2013 3:38 PM
 To: [hidden email]
 Subject: Re: Index data from multiple tables into Solr

 okay, thank you.

 After indexing data from database to solr. I want to search such that 
 if i write any word (that is included in the documents been indexed) 
 it should return all the documents that include that word. But it does not. 
 When i
 write http://localhost:8983/solr/select?q=anyword   i gives me error.

 is there anything wrong with my http? or is this the wrong place to 
 search?


 On Tue, Jan 15, 2013 at 2:48 PM, sswoboda [via Lucene] [hidden email] wrote:

  https://wiki.apache.org/solr/Solrj client. You'd have to configure 
  it / use it based on your application needs.
 
  -Original Message-
  From: hassancrowdc [mailto:[hidden email]]
  Sent: Tuesday, January 15, 2013 2:38 PM
  To: [hidden email]
  Subject: Re: Index data from multiple tables into Solr
 
  ok.
  so if i have manufacturer and id fields in schema file, what will be 
  wat will be program that will use that will use solr API?
 
 
 
 
 
 












RE: Reading properties in data-import.xml

2013-01-10 Thread Swati Swoboda
I am on 3.6 and this is my setup:

Properties file under solr.home, so right under /jetty/solr
solr.xml modified as follows:

<core name="corename" instanceDir="instancedir" properties="../solrcore.properties" />

http://wiki.apache.org/solr/CoreAdmin#property - the path is relative to 
instancedir


Your syntax in the DIH config is correct; I think all you are missing is the
reference to the property file in solr.xml.
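
Putting it together, a sketch of the two pieces (the names come from your example and may differ):

# jetty/solr/solrcore.properties
host=dbhost
username=myusername
password=mypassword

<!-- solr.xml -->
<cores adminPath="/admin/cores">
  <core name="corename" instanceDir="instancedir" properties="../solrcore.properties" />
</cores>

The ${host}, ${username}, and ${password} references in data-import.xml then resolve against that file.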

-Original Message-
From: Dariusz Borowski [mailto:darius...@gmail.com] 
Sent: Thursday, January 10, 2013 10:38 AM
To: solr-user@lucene.apache.org
Subject: Re: Reading properties in data-import.xml

Thanks Alexandre!

I followed your example and created a solrcore.properties in 
solr.home/conf/solrcore.properties.
I created a symlink in my core/conf to the solrcore.properties file, but I 
can't read the properties.

My properties file:
username=myusername
password=mypassword


My data-import.xml:
<dataSource
    type="JdbcDataSource"
    driver="com.mysql.jdbc.Driver"
    url="jdbc:mysql://${host}:3306/projectX"
    user="${username}"
    password="${password}" />

Is the syntax correct?

Best regards,
Dariusz






On Thu, Jan 10, 2013 at 3:21 PM, Alexandre Rafalovitch
arafa...@gmail.comwrote:

 dataimport.properties is for DIH to store its own properties for delta
 processing and things. Try solrcore.properties instead, as per a recent
 discussion:
 discussion:

 http://lucene.472066.n3.nabble.com/Reading-database-connection-propert
 ies-from-external-file-td4031154.html

 Regards,
Alex.

 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all 
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD 
 book)


 On Thu, Jan 10, 2013 at 3:58 AM, Dariusz Borowski darius...@gmail.com
 wrote:

  I'm having a problem using a property file in my data-import.xml file.
 
  My aim is to not hard code some values inside my xml file, but rather to
  reuse the values from a property file. I'm using multicore, and some of the
  values change from time to time; I do not want to change them in all my
  data-import files.
 
  For example:
 
  <dataSource
      type="JdbcDataSource"
      driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://${host}:3306/projectX"
      user="${username}"
      password="${password}" />
 
  I tried everything, but don't know how I can use properties here. I tried
  to put my values in dataimport.properties, located under SOLR-HOME/conf and
  under SOLR-HOME/core1/conf, but without any success.
 
  Please, could someone help me on this?
 



RE: Need help with delta import

2012-12-14 Thread Swati Swoboda
If I am not mistaken, it's supposed to be dataimporter.delta.ID and
dataimporter.last_index_time; you are using dataimport.delta.ID and
dataimport.last_index_time.

http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
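
The delta-via-full-import trick on that wiki page boils down to a query of roughly this shape (a sketch; the table and column names are hypothetical):

<entity name="item"
        query="select * from item
               where '${dataimporter.request.clean}' != 'false'
                  or last_modified > '${dataimporter.last_index_time}'">

Run with command=full-import&clean=false it behaves like a delta import; run with clean=true it reindexes everything.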



-Original Message-
From: umajava [mailto:umaj...@gmail.com] 
Sent: Thursday, December 13, 2012 9:35 PM
To: solr-user@lucene.apache.org
Subject: RE: Need help with delta import

Thanks a lot for your reply.

I have made the changes but it still does not work. I still get the same 
results. Any other suggestions please?





RE: Need help with delta import

2012-12-14 Thread Swati Swoboda
I am also confused, as I've been using dataimporter.* and not dih.* and it is 
working fine. 

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Friday, December 14, 2012 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Need help with delta import

On 12/14/2012 11:39 AM, Dyer, James wrote:
 Try ${dih.delta.ID} instead of ${dataimporter.delta.id}.  Also use 
 ${dih.last_index_time} instead of ${dataimporter.last_index_time} .  I 
 noticed when updating the test cases that the wiki incorrectly used the 
 longer name but with all the versions I tested this on only the short name 
 works.  The wiki has since been changed.

James,

I use DIH for full Solr reindexes.  My dih config makes extensive use of 
${dataimporter.request.XXX} variables for my own custom parameters.  I am using 
branch_4x checked out yesterday on my dev machine, and I did a full reindex on 
that version, which worked.  Three questions: 1) Should I be using 
${dih.request.XXX} instead?  2) Is the longer syntax going away?  3) What 
issues and/or docs would be good reading material?

Thanks,
Shawn



RE: Can a field with defined synonym be searched without the synonym?

2012-12-12 Thread Swati Swoboda
Query-time analyzers are still applied, even if you include a string in quotes.
Would you expect "foo" to not match "Foo" just because it's enclosed in quotes?

Also look at this, someone who had similar requirements:
http://lucene.472066.n3.nabble.com/Synonym-Filter-disable-at-query-time-td2919876.html
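
If you end up needing a synonym-free copy of the field, the schema change is small; a sketch (the field and type names here are hypothetical):

<field name="title_exact" type="text_nosyn" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>

where text_nosyn is defined like your existing type minus the SynonymFilterFactory. Searches against title_exact then bypass synonyms entirely.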


-Original Message-
From: joe.cohe...@gmail.com [mailto:joe.cohe...@gmail.com] 
Sent: Wednesday, December 12, 2012 12:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Can a field with defined synonym be searched without the synonym?


I'm applying only query-time synonyms, so I have the original values stored and
indexed. I would've expected that if I search a string with quotation marks,
I'll get the exact match, without applying a synonym.

Any way to achieve that?


Upayavira wrote
 You can only search against terms that are stored in your index. If 
 you have applied index time synonyms, you can't remove them at query time.
 
 You can, however, use copyField to clone an incoming field to another 
 field that doesn't use synonyms, and search against that field instead.
 
 Upayavira
 
 On Wed, Dec 12, 2012, at 04:26 PM, joe.cohen.m@ wrote:
 Hi,
 I have a field type with a defined synonym.txt which retrieves both
 records with home and house when I search either one of them.
 
 I want to be able to search this field on the specific value that I 
 enter, without the synonym filter.
 
 is it possible?
 
 thanks.
 
 
 







RE: Boost docs which are posted recently

2012-12-11 Thread Swati Swoboda
Hi Sangeetha,

If you need to boost based on date regardless of type, just use date boosting 
with a higher boost:

http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
http://wiki.apache.org/solr/FunctionQuery#Date_Boosting
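
Following the pattern on those pages, a dismax request combining the type boosts with a recency boost might look like this (the date field name is hypothetical; spaces would be URL-encoded in practice):

q=sachin&defType=dismax&bq=type_s:photos^10 type_s:videos^7 type_s:news^5&bf=recip(ms(NOW,posted_date),3.16e-11,1,1)

Note that a single bq parameter can hold several space-separated clauses, and the bf function boost decays smoothly as documents age.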



-Original Message-
From: Sangeetha [mailto:sangeetha...@gmail.com] 
Sent: Tuesday, December 11, 2012 4:29 AM
To: solr-user@lucene.apache.org
Subject: Boost docs which are posted recently

Hi,

I have a doc with the field type_s. The value can be news, photos, or videos.

The priority will be given in this order: photos, videos, then news, using the
query below:

q=sachin&defType=dismax&bq=type_s:photos^10&bq=type_s:videos^7&bq=type_s:news^5

Even though it gives more priority to photos, sometimes it needs to rank
videos/news higher if they were posted recently.

How can I achieve this? Is it possible to use a single bq for multiple fields
using space or +?

Thanks,
Sangeetha





RE: Searching for phrase

2012-12-11 Thread Swati Swoboda
It's because you are escaping.

Look at this bit:

[parsedquery_toString] = +(smsc_content:"abcdefg12345 smsc_content:678910"
smsc_description:"abcdefg12345
smsc_content:678910") +smsc_lastdate:[1352627991000 TO 1386755331000]

It's searching for " as well because you escaped it (hence it is not considered
'special' anymore, which, in this case, you don't want). Just search for
"abcdefg12345 678910" without escaping the quotes.

Your parsed query (in debug mode) should look something like this:
<str name="parsedquery">PhraseQuery(smsc_content:"abcdefg12345 678910")</str>
<str name="parsedquery_toString">smsc_content:"abcdefg12345 678910"</str>

Swati


-Original Message-
From: Arkadi Colson [mailto:ark...@smartbit.be] 
Sent: Tuesday, December 11, 2012 10:36 AM
To: solr  solr-user@lucene.apache.org
Subject: Searching for phrase

Hi

My schema looks like this:

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords_en.txt,stopwords_du.txt" enablePositionIncrements="true"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/> -->
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords_en.txt,stopwords_du.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

I inserted these 2 strings into Solr:
abcdefg12345 678910
abcdefg12345 xyz 678910

When searching for "abcdefg12345 678910" with quotes I get no result. Without
quotes both strings are found.

SolrObject Object
(
 [responseHeader] = SolrObject Object
 (
 [status] = 0
 [QTime] = 38
 [params] = SolrObject Object
 (
 [sort] = score desc
 [indent] = on
 [collection] = intradesk
 [wt] = xml
 [version] = 2.2
 [rows] = 5
 [debugQuery] = true
 [fl] =
id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_content,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subject
 [start] = 0
                [q] = (smsc_content:\"abcdefg12345 678910\" ||
smsc_description:\"abcdefg12345 678910\") &&
(smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) &&
(smsc_ssid:929)
 )

 )

 [response] = SolrObject Object
 (
 [numFound] = 0
 [start] = 0
 [docs] =
 )

 [debug] = SolrObject Object
 (
                [rawquerystring] = (smsc_content:\"abcdefg12345 678910\"
|| smsc_description:\"abcdefg12345 678910\") &&
(smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) &&
(smsc_ssid:929)
                [querystring] = (smsc_content:\"abcdefg12345 678910\" ||
smsc_description:\"abcdefg12345 678910\") &&
(smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) &&
(smsc_ssid:929)
                [parsedquery] = +(smsc_content:"abcdefg12345 smsc_content:678910"
smsc_description:"abcdefg12345
smsc_content:678910") +smsc_lastdate:[1352627991000 TO 1386755331000]
+smsc_ssid:929
                [parsedquery_toString] = +(smsc_content:"abcdefg12345
smsc_content:678910" smsc_description:"abcdefg12345
smsc_content:678910") +smsc_lastdate:[1352627991000 TO 1386755331000]
+smsc_ssid:`&#8;&#0;&#0;&#7;!
 [QParser] = LuceneQParser
 [explain] = SolrObject Object
 (
 )

 )

)

Anybody have an idea what's wrong?

--
Met vriendelijke groeten

Arkadi Colson

Smartbit bvba . Hoogstraat 13 . 3670 Meeuwen T +32 11 64 08 80 . F +32 11 64 08 
81



RE: highlighting multiple occurrences

2012-12-10 Thread Swati Swoboda
Did you mean that you want multiple snippets? 

http://wiki.apache.org/solr/HighlightingParameters#hl.snippets



-Original Message-
From: Rafael Ribeiro [mailto:rafae...@gmail.com] 
Sent: Monday, December 10, 2012 11:20 AM
To: solr-user@lucene.apache.org
Subject: highlighting multiple occurrences

Hi all,

 I have a solr instance with one field configured for highlighting as
follows:
 <str name="hl">on</str>
 <str name="hl.fl">conteudo</str>
 <str name="hl.fragsize">500</str>
 <str name="hl.maxAnalyzedChars">9</str>
 <str name="hl.simple.pre">&lt;font style="background-color: yellow"&gt;</str>
 but I would like the highlighter to display multiple occurrences of the query
instead of just the first one... Is that possible? I tried searching this
mailing list but I couldn't find anyone mentioning it.

best regards,
Rafael





RE: Is there a way to round data when index, but still able to return original content?

2012-12-10 Thread Swati Swoboda
When you apply your analyzers/filters/tokenizers, the resulting value is what
gets indexed; however, the original input value is what gets stored. For
example, from the schema.xml file:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

This particular field type will strip out the HTML. So if the input is:

<b>Hello</b>

it gets tokenized in the index as

hello

but it gets stored (and hence returned to you) as

<b>Hello</b>

So you can create your own charFilter or filter class which converts your date 
for the indexer, but the original data will automatically be stored.

I hope this makes sense.

-Original Message-
From: jefferyyuan [mailto:yuanyun...@gmail.com] 
Sent: Monday, December 10, 2012 10:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Is there a way to round data when index, but still able to return 
original content?

Erick, Thanks for your reply.

I know how to implement the solution 1.

But I have no idea how to implement solution 2 that you mentioned:
===
If you put some sort of (perhaps custom) filter in place, then the original 
value would go in as stored and the altered value would get in the index and 
you could do both in the same field. 

Can you please describe more about how to store the original data and index the
altered value in the same field?

Thanks :)









RE: highlighting multiple occurrences

2012-12-10 Thread Swati Swoboda
Rafael,

Can you share more on how you are rendering the results in your velocity 
template? The data is probably being sent to you, but you have to loop through 
and actually access the data.

-Original Message-
From: Rafael Ribeiro [mailto:rafae...@gmail.com] 
Sent: Monday, December 10, 2012 2:26 PM
To: solr-user@lucene.apache.org
Subject: RE: highlighting multiple occurrences

yep!

 I tried enabling this and settings various values bot no success... still it 
only shows the first fragment of the search found...
 I also saw this
http://lucene.472066.n3.nabble.com/hl-snippets-in-solr-3-1-td2445178.html
but increasing maxAnalyzedChars (that was already huge) produced no difference 
at all.
 Do I have to change anything else? For example, something on the velocity 
template???

best regards,
Rafael





RE: Is there a way to round data when index, but still able to return original content?

2012-12-10 Thread Swati Swoboda
Hi,

Nope... they don't. Generally, I am not sure I'd bother rounding this
information to reduce the index size. Have you determined how much index space
you'll actually be saving? I am not confident that it'd be worth your time;
i.e., I'd just go with indexing/storing the time information as well.

Regardless, if you do want to go this route, the only way I can think of that 
wouldn't be a complicated solution is to have one field that is 
indexed/rounded (and not stored) and another field that is just stored (and not 
indexed).
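
In schema.xml that split would look roughly like this (a sketch; the field names are hypothetical, and the rounding itself would still happen client-side or in a custom update processor before indexing):

<!-- searched on: holds the rounded value, never returned -->
<field name="created_rounded" type="tdate" indexed="true" stored="false"/>
<!-- returned in results exactly as supplied, never searched -->
<field name="created_raw" type="tdate" indexed="false" stored="true"/>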

Hope this helps.

-Original Message-
From: jefferyyuan [mailto:yuanyun...@gmail.com] 
Sent: Monday, December 10, 2012 3:14 PM
To: solr-user@lucene.apache.org
Subject: RE: Is there a way to round data when index, but still able to return 
original content?

Sorry to ask a question again, but I want to round dates (TrieDate) and
TrieLongField values; it seems they don't support configuring an analyzer
(charFilter, tokenizer, or filter).

What should I do? Now I am thinking of writing my own custom date or long
field; is there any other way? :)

Thanks :)
 





RE: Selective field level security

2012-09-17 Thread Swati Swoboda
Hi Nalini,

We had similar requirements and this is how we did it (using your example):

Record A:
Field1_All: something
Field1_Private: something
Field2_All: ''
Field2_Private: something private
Field3_All: ''
Field3_Private: something very private

Fields_All: something
Fields_Private: something something private something very private

Basically, we're just using a lot of copy fields and dynamic fields. Instead of
storing a type, we just change the column name. So if someone has access to
private fields, we perform our search in the private column fields:

(fields_private:something)

Or if you want a specific field:

(field1_private:something) OR (field2_private:something) or 
(field3_private:something)

Likewise, if someone didn't have access to the private fields, we would only 
search in the all fields. We also created a super field so that we don't 
have to search each individual field -- we use copyfields to copy all private 
fields into the super field and just search that.
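
In schema terms, the setup is mostly dynamic fields plus copyFields, along these lines (a sketch; the type names are illustrative):

<dynamicField name="*_All" type="text" indexed="true" stored="true"/>
<dynamicField name="*_Private" type="text" indexed="true" stored="true"/>
<field name="Fields_All" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="Fields_Private" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="*_All" dest="Fields_All"/>
<copyField source="*_All" dest="Fields_Private"/>
<copyField source="*_Private" dest="Fields_Private"/>

The double copy into Fields_Private is what lets privileged users search public and private content through a single field.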

I hope this helps.

Swati

-Original Message-
From: Nalini Kartha [mailto:nalinikar...@gmail.com] 
Sent: Monday, September 17, 2012 2:45 PM
To: solr-user@lucene.apache.org
Subject: Selective field level security

Hi,

We're trying to push some security related info into the index which will 
control which users can search certain fields and we're wondering what the best 
way to accomplish this is.

Some records that are being indexed and searched can have certain fields marked 
as private. When a field is marked as private, some querying users should not 
see/search on it whereas some super users can.

Here's the solutions we're considering -

   - Index a separate boolean value into a new _INTERNAL field to indicate
   if the corresponding field value is marked private or not and include a
   filter in the query when the searching user is not a super user.

So for eg., consider that a record can contain 3 fields - field[123] where
field1 and field2 can be marked as private but field3 cannot.

Record A has only field1 marked as private, record B has both field1 and
field2 marked as private.

When we index these records here's what we'd end up with in the index -

Record A - field1:something, field1_INTERNAL:1, field2:something,
field2_INTERNAL:0, field3:something
Record B - field1:something, field1_INTERNAL:1, field2:something,
field2_INTERNAL:1, field3:something

If the searching user is NOT a super user then the query (let's say it's 
'hidden security') needs to look like this-

((field3:hidden) OR (field1:hidden AND field1_INTERNAL:0) OR (field2:hidden AND 
field2_INTERNAL:0)) AND ((field3:security) OR (field1:security AND
field1_INTERNAL:0) OR (field2:security AND field2_INTERNAL:0))

Manipulating the query this way seems painful and error prone so we're 
wondering if Solr provides anything out of the box that would help with this?


   - Index the private values themselves into a separate _INTERNAL field
   and then determine which fields to query depending on the visibility of the
   searching user.

So using the example from above, here's what the indexed records would look 
like -

Record A - field1_INTERNAL:something, field2:something, field3:something
Record B - field1_INTERNAL:something, field2_INTERNAL:something,
field3:something

If the searching user is NOT a super user then the query just needs to be 
against the regular fields whereas if the searching user IS a super user, the 
query needs to be against BOTH the regular and INTERNAL fields.

The issue with this solution is that since the number of docs that include the 
INTERNAL fields is going to be much fewer we're wondering if relevancy would be 
messed up when we're querying both regular and internal fields for super users?

Thoughts?

Thanks,
Nalini


RE: Stats field with decimal values

2012-09-17 Thread Swati Swoboda
You can use an XSL response writer to transform your values to have a different 
precision.

http://wiki.apache.org/solr/XsltResponseWriter

Would most likely be better for your client to just do it on his end though. He 
is probably parsing the response anyway.
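
If you do go the XSLT route, the request is your normal query with a different writer (the stylesheet name here is hypothetical; it must live in the core's conf/xslt directory):

http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&wt=xslt&tr=round.xsl

Inside round.xsl, XSLT's format-number() can then trim the stats values to whatever precision is wanted.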

-Original Message-
From: Gustav [mailto:xbihy...@sharklasers.com] 
Sent: Monday, September 17, 2012 1:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Stats field with decimal values

Well, my client is asking if it is possible; I'm just providing the search
engine to him, not working directly with the application. I don't know exactly
what language he is programming in.





RE: Solr Index problem

2012-08-23 Thread Swati Swoboda
Are you committing? You have to commit for the documents to actually be added.
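
For example, with DIH you can pass the commit flag explicitly, or issue a commit yourself afterwards (the URLs assume a default single-core setup):

http://localhost:8983/solr/dataimport?command=full-import&commit=true
http://localhost:8983/solr/update?commit=true

Until a commit happens, newly added documents won't show up in searches.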

-Original Message-
From: ranmatrix S [mailto:ranmat...@gmail.com] 
Sent: Thursday, August 23, 2012 5:46 PM
To: solr-user@lucene.apache.org
Subject: Solr Index problem

Hi,

I have set up Solr to index data from an Oracle DB through the DIH handler.
Through the Solr admin I can see that the DB connection is successful and data
is retrieved from the DB, but it is not added to the index. The message says 0
documents added even when I am able to see that 9 records are returned.

The schema and fields in db-data-config.xml are one and the same.

Please suggest if anything I should look for.

--
Regards,
Ran...


RE: DataImportHandler WARNING: Unable to resolve variable

2012-08-09 Thread Swati Swoboda
I am getting a similar issue while using a Template Transformer. My fields
*always* have a value as well - it is getting indexed correctly.

Furthermore, the number of warnings I get seems arbitrary. I imported one 
document (debug mode) and I got roughly ~400 of those warning messages for the 
single field.

-Original Message-
From: Jon Drukman [mailto:jdruk...@gmail.com] 
Sent: Thursday, August 09, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: DataImportHandler WARNING: Unable to resolve variable

I'm trying to use DataImportHandler's delta-import functionality but I'm 
getting loads of these every time it runs:

WARNING: Unable to resolve variable: article.url_type while parsing
expression: article:${article.url_type}:${article.id}

The definition looks like:

<entity name="article"
        query="... irrelevant ..."

        deltaQuery="select id,'dummy' as type_id FROM articles WHERE (post_date
&gt; '${dataimporter.last_index_time}' OR updated_date &gt;
'${dataimporter.last_index_time}') AND post_date &lt;= NOW() AND status = 9"

        deltaImportQuery="select id, article_seo_title,
DATE_FORMAT(post_date,'%Y-%m-%dT%H:%i:%sZ') post_date, subject,
        body, IF(url_type='', 'article', url_type) url_type,
featured_image_url from articles WHERE id = ${dataimporter.delta.id}"
        transformer="TemplateTransformer,HTMLStripTransformer">
    <field column="id" name="id" />
    <field column="post_date" name="post_date" />
    <field column="subject" name="title" />
    <field column="body" name="subhead" stripHTML="true" />
    <field column="type_id" template="article:${article.url_type}:${article.id}" />
    <field column="type" template="2" />
    <field column="featured_image_url" name="main_image" />
    <field column="article_seo_title" name="seo_title" />
</entity>

As you can see, I am always making sure that article.url_type has some value.  
Why am I getting the warning?

-jsd-


RE: DataImportHandler WARNING: Unable to resolve variable

2012-08-09 Thread Swati Swoboda
Ah, my bad. I was incorrect - it was not actually indexing. 

@Jon - is there a possibility that your url_type is NULL, but not empty? Your 
if check only checks to see if it is empty, which is not the same as checking 
to see if it is null. If it is null, that's why you'd be having those errors - 
null values are just not accepted, it seems.
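
A hedged guess at the fix, in the deltaImportQuery's SQL (MySQL syntax, extending your existing IF(...) expression):

-- treat NULL the same as the empty string
IF(url_type IS NULL OR url_type = '', 'article', url_type) url_type

That way NULL collapses to 'article' before the TemplateTransformer ever sees the row.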

Swati


-jsd-