Re: Fast autocomplete for large dataset
Thank you Erick for your reply. If I understand correctly, these approaches use the index to hold the terms, so performance can become an issue as the index grows. Is that right? Could you please check this article to see what I mean: http://www.norconex.com/serving-autocomplete-suggestions-fast/ Thank you. Regards Olivier

2015-08-01 17:42 GMT+02:00 Erick Erickson erickerick...@gmail.com:
Well, defining what you mean by autocomplete would be a start. If it's just that a user types some letters and you suggest the next N terms in the list, TermsComponent will fix you right up. If it's more complicated, the AutoSuggest functionality might help. If it's correcting spelling, there's the spellchecker. Best, Erick
Re: Fast autocomplete for large dataset
Thank you Erick, I would like to implement autocomplete for a large dataset. The autocomplete should show the phrase or question the user wants as the user types. The requirement is that it be fast (not slowed down by the volume of data as the dataset grows) and easy to maintain. The autocomplete can have its own Solr server. It is an autocomplete like any other; it just has to be fast and easy to maintain. What are the limitations of the suggesters mentioned in the article? Thank you. Regards Olivier

2015-08-01 19:41 GMT+02:00 Erick Erickson erickerick...@gmail.com:
Not really. There's no need to use ngrams as the article suggests if the terms component does what you need. Which is why I asked you what autocomplete means in your context, which you have not clarified. Have you even looked at the terms component? Especially the terms.prefix option? The terms component has its limitations, but performance isn't one of them. The suggesters mentioned in the article have other limitations. It's really useless to discuss those limitations, though, until the problem you're trying to solve is clearly stated.
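The terms.prefix option Erick mentions can be illustrated with a small request builder. This is only a sketch under assumptions: the host, the collection name `collection1`, the field name `name`, and a `/terms` request handler being registered are all hypothetical, and the prefix is lowercased on the assumption that the field is lowercased at index time.

```python
from urllib.parse import urlencode


def terms_prefix_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent request URL that lists indexed terms
    in `field` starting with `prefix`."""
    params = {
        "terms": "true",
        "terms.fl": field,               # field whose indexed terms are listed
        "terms.prefix": prefix.lower(),  # assumption: terms are lowercased at index time
        "terms.limit": limit,            # maximum number of terms returned
        "wt": "json",
    }
    return base_url.rstrip("/") + "/terms?" + urlencode(params)


# Hypothetical host, collection and field names.
url = terms_prefix_url("http://localhost:8983/solr/collection1", "name", "Sol")
print(url)
```

The response would then be parsed by whatever layer sits between the browser and Solr; that part is left out here.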
Fast autocomplete for large dataset
Hi, I am looking for a fast and easy-to-maintain way to do autocomplete for a large dataset in Solr. I have heard about the Ternary Search Tree (TST): https://en.wikipedia.org/wiki/Ternary_search_tree But I would like to know whether I have missed something, such as a best practice or a new Solr feature. Any suggestion is welcome. Thank you. Regards Olivier
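For readers unfamiliar with the Ternary Search Tree mentioned in the question, here is a minimal in-memory sketch of the data structure with prefix completion. It is a generic textbook-style implementation written for illustration, not Solr's or Lucene's own code; the class and method names are made up for this example.

```python
class Node:
    __slots__ = ("ch", "lo", "eq", "hi", "is_end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None  # smaller / next char / larger
        self.is_end = False                 # True if a stored word ends here


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, word):
        if word:
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, i):
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_end = True
        return node

    def complete(self, prefix, limit=10):
        """Return up to `limit` stored words starting with `prefix`, sorted."""
        results = []
        if not prefix:
            return results
        node, i = self.root, 0
        while node is not None:  # walk down to the node of the last prefix char
            ch = prefix[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(prefix):
                    break
                node = node.eq
        if node is None:
            return results
        if node.is_end:
            results.append(prefix)
        self._collect(node.eq, prefix, results, limit)
        return results

    def _collect(self, node, path, results, limit):
        # In-order traversal yields completions in lexicographic order.
        if node is None or len(results) >= limit:
            return
        self._collect(node.lo, path, results, limit)
        if node.is_end and len(results) < limit:
            results.append(path + node.ch)
        self._collect(node.eq, path + node.ch, results, limit)
        self._collect(node.hi, path, results, limit)


tree = TernarySearchTree()
for word in ["solr", "solar", "sole", "search", "suggest"]:
    tree.insert(word)
print(tree.complete("sol"))
```

Lookup cost depends on the prefix length and the tree's balance rather than directly on the number of stored words, which is why the structure is attractive for autocomplete; the tree still has to be built and kept in memory.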
Re: Fast autocomplete for large dataset
Thank you Erick for your replies and the link. Regards Olivier

2015-08-02 3:47 GMT+02:00 Erick Erickson erickerick...@gmail.com:
Here's some background: http://lucidworks.com/blog/solr-suggester/

Basically, the limitation is that to build the suggester, all docs in the index need to be read to pull out the stored field and build either the FST or the sidecar Lucene index, which can be a _very_ costly operation (as in minutes/hours for a large dataset).

bq: The requirement is that the autocomplete should be fast (not slowdown by the volume of data as dataset become bigger)

Well, in some alternate universe this may be possible. But the larger the corpus, the slower the processing will be; there's just no way around that. Whether it's fast enough for your application is a better question ;).

Best, Erick
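Because building the suggester is the costly step Erick describes, a client would normally ask for a rebuild only on purpose, not on every keystroke. A minimal sketch of that separation follows; the host, the collection, the `/suggest` handler path and the dictionary name `mySuggester` are assumptions that depend on how solrconfig.xml is set up.

```python
from urllib.parse import urlencode


def suggest_url(base_url, dictionary, text, count=5, build=False):
    """Build a suggester request URL. `build=True` asks Solr to rebuild
    the suggester, which can take minutes or hours on a large index."""
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,  # hypothetical suggester name
        "suggest.q": text,                 # what the user has typed so far
        "suggest.count": count,
        "wt": "json",
    }
    if build:
        params["suggest.build"] = "true"   # costly: only for scheduled rebuilds
    return base_url.rstrip("/") + "/suggest?" + urlencode(params)


# Per-keystroke lookup: no rebuild.
lookup = suggest_url("http://localhost:8983/solr/collection1", "mySuggester", "how do")
# Occasional rebuild, e.g. from a scheduled job after a large indexing run.
rebuild = suggest_url("http://localhost:8983/solr/collection1", "mySuggester", "", build=True)
print(lookup)
print(rebuild)
```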
Re: How to implement Auto complete, suggestion client side
Hi, Thank you Dan Davis and Alexandre Rafalovitch. This is very helpful for me. Regards Olivier

2015-01-27 0:51 GMT+01:00 Alexandre Rafalovitch arafa...@gmail.com:
You've got a lot of options depending on what you want. But since you seem to just want _an_ example, you can use mine from http://www.solr-start.com/javadoc/solr-lucene/index.html (gray search box there). You can see the source for the test screen (using Spring Boot and Spring Data Solr as a middle layer, and Select2 for the UI) at: https://github.com/arafalov/Solr-Javadoc/tree/master/SearchServer

The Solr definition is at: https://github.com/arafalov/Solr-Javadoc/tree/master/JavadocIndex/JavadocCollection/conf

Other implementation pieces are in that (and another) public repository as well, but it's all in Java. You'll probably want to do something similar in PHP. Regards, Alex.
How to implement Auto complete, suggestion client side
Hi All, I am new to web technology. I would like to implement autocomplete/suggestions in the search box as the user types (like Google, for example). I am using Solr as the database. I am familiar with Solr and can formulate suggestion queries, but I don't know how to implement suggestions in the user interface. Which technologies do I need? The website is in PHP. Any suggestions, examples, or basic tutorials are welcome. Thank you. Regards Olivier
Architecture for PHP web site, Solr and an application
Hi, I would like to query only some fields in Solr, depending on the user input, since I know the fields. The user submits an HTML form to the PHP website. The application gets the fields and their content from the PHP website, then formulates a query to Solr based on these fields and other contextual information. Only fields from the HTML form are used, and the forms do not all have the same fields. The application is not yet developed. It could be written in C++, Java, or another language, using a database, and it needs more resources than the website. I am wondering which architecture is suitable for this case: - How to make the architecture scalable (to support more users). - How to make PHP communicate with the application if the application is not written in PHP. Any suggestion is welcome. Thank you. Regards Olivier
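The step where the application turns submitted form fields into a Solr query could look roughly like the sketch below. It is only an illustration under assumptions: the field names in `ALLOWED_FIELDS` are hypothetical, every value is treated as an exact phrase, and the clauses are joined with AND.

```python
# Hypothetical schema fields that forms are allowed to query.
ALLOWED_FIELDS = {"title", "author", "category"}


def build_query(form):
    """Turn {field: value} pairs from a form into a Solr query string.
    Unknown fields and empty values are ignored; values are quoted as
    phrases, so only backslashes and double quotes need escaping."""
    clauses = []
    for field in sorted(form):
        value = form[field].strip()
        if field not in ALLOWED_FIELDS or not value:
            continue
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"')
    return " AND ".join(clauses) if clauses else "*:*"


print(build_query({"title": "solr guide", "author": "", "debug": "on"}))
```

If the PHP site and the application are separate processes, the form fields could be passed between them as a JSON object over HTTP, which keeps the application's implementation language independent of PHP.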
UI for Solr
Hi, I would like to build a user interface on top of Solr for PC and mobile. I am wondering whether there is a commonly used framework or best practice. I want Solr features such as suggestions, autocomplete, and facets to be available in the UI. Any suggestion is welcome. Thank you. Regards Olivier
Re: UI for Solr
Hi Alex, Thank you for the prompt reply. I was not aware of Spring.io's Spring Data Solr. Regards Olivier

2014-12-23 16:50 GMT+01:00 Alexandre Rafalovitch arafa...@gmail.com:
You don't expose Solr directly to the user; it is not set up for foolproof security out of the box. So you would need a client to talk to Solr. Something like Spring.io's Spring Data Solr could be one of the things to check. You can see an auto-complete example for it at: https://github.com/arafalov/Solr-Javadoc/tree/master/SearchServer/src/main and embedded in action at http://www.solr-start.com/javadoc/solr-lucene/index.html (search box on the top). Regards, Alex.
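Alex's point about not exposing Solr directly can be sketched as a small middle-layer function: the browser sends a few parameters, and the server decides what actually reaches Solr. The accepted parameter names and limits below are illustrative assumptions, and this alone is not a complete security measure.

```python
def safe_params(client_params, max_rows=20, max_query_len=200):
    """Map untrusted client parameters to a fixed, server-chosen set of
    Solr parameters. Anything not handled here is dropped."""
    q = str(client_params.get("q", "")).strip()[:max_query_len]
    try:
        start = max(0, int(client_params.get("start", 0)))
    except (TypeError, ValueError):
        start = 0
    return {
        "q": q or "*:*",
        "start": start,
        "rows": max_rows,  # the client cannot request an arbitrary page size
        "wt": "json",
    }


# A request that tries to switch handlers and fetch a huge page is reduced
# to the parameters the server allows.
print(safe_params({"q": "solr", "qt": "/update", "rows": "100000", "start": "-5"}))
```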
Re: Indexing documents/files for production use
Thank you Alexandre, Jürgen and Erick for your replies. It is clear to me now. Regards Olivier

2014-10-28 23:35 GMT+01:00 Erick Erickson erickerick...@gmail.com:
And one other consideration in addition to the two excellent responses so far: in a SolrCloud environment, SolrJ via CloudSolrServer will automatically route the documents to the correct shard leader, saving some additional overhead. Post.jar and cURL send the docs to a node, which in turn forwards the docs to the correct shard leader, which lowers throughput. Best, Erick

On Tue, Oct 28, 2014 at 2:32 PM, Jürgen Wagner (DVT) juergen.wag...@devoteam.com wrote:
Hello Olivier, for real production use, you won't really want to use any toys like post.jar or curl. You want a decent connector to whatever data source there is, that fetches data, possibly massages it a bit, and then feeds it into Solr - by means of SolrJ or directly into the web service of Solr via binary protocols. This way, you can properly handle incremental feeding, processing of data from remote locations (with the connector being closer to the data source), and also source data security.

Also think about what happens if you do processing of incoming documents in Solr. What happens if Tika runs out of memory because of PDF problems? What if this crashes your Solr node? In our Solr projects, we generally do not do any sizable processing within Solr, as document processing and document indexing or querying all have different scaling properties.

Production use most typically is not achieved by deploying a vanilla Solr, but rather by having a bit more glue and wrappage, so the whole will fit your requirements in terms of functionality, scaling, monitoring and robustness. Some similar platforms like Elasticsearch try to alleviate these pains of going to a production-style infrastructure, but that's at the expense of flexibility and comes with limitations.

For proof-of-concept or demonstrator-style applications, the plain tools out of the box will be fine. For production applications, you want to have more robust components. Best regards, --Jürgen
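The "connector" Jürgen describes typically feeds documents in batches rather than one request per document. The sketch below shows only the batching step and the JSON bodies that could be posted to Solr's update handler; it does not replace SolrJ, which the thread recommends and which accepts collections of documents through its add methods. The batch size and the document fields are illustrative assumptions.

```python
import json


def batches(docs, size=500):
    """Yield lists of at most `size` documents from any iterable."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def update_payloads(docs, size=500):
    """Yield UTF-8 JSON bodies, one per batch, suitable for an HTTP POST
    to the update handler with Content-Type: application/json."""
    for batch in batches(docs, size):
        yield json.dumps(batch).encode("utf-8")


# Hypothetical documents; a real connector would read them from the data source.
docs = ({"id": str(i), "title": f"doc {i}"} for i in range(5))
payloads = list(update_payloads(docs, size=2))
print(len(payloads))
```

Keeping extraction and batching outside Solr also follows Jürgen's advice to do heavy document processing in a separate component.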
Indexing documents/files for production use
Hi All, I am reading the Solr documentation. I understand that post.jar http://wiki.apache.org/solr/ExtractingRequestHandler#SimplePostTool_.28post.jar.29 is not meant for production use and that cURL https://cwiki.apache.org/confluence/display/solr/Introduction+to+Solr+Indexing is not recommended. Is SolrJ better for production? Thank you. Regards Olivier
OpenExchangeRates.Org rates in solr
Hi, Is there a way to see the OpenExchangeRates.Org (http://www.OpenExchangeRates.Org) rates that Solr is using? I have changed the configuration to use these rates. Thank you. Regards Olivier
Re: OpenExchangeRates.Org rates in solr
Hi Will, I am learning Solr for now. I may use it later for business or for free access. Thank you. Regards Olivier

2014-10-26 17:32 GMT+01:00 Will Martin wmartin...@gmail.com:
Hi Olivier: Can you clarify this message? Are you using Solr at the business? Or are you giving free access to Solr installations? Thanks, Will
Re: Remove indexes of XML file
Thank you Alex, I think I can use the file to delete the corresponding documents from the index. Regards Olivier

2014-10-24 21:51 GMT+02:00 Alexandre Rafalovitch arafa...@gmail.com:
You can delete individually, all (*:* query) or by specific query. So, if there is no common query pattern, you may need to do a multi-id query, something like id:(id1 id2 id3 id4), which does require you to know the IDs. Regards, Alex.
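Olivier's idea of using the original XML file to drive the delete can be combined with Alex's multi-id query. The sketch below assumes the file uses the standard `<add><doc><field name="id">` layout from the tutorial and that the unique key field is called `id`; it only builds the delete message and leaves sending it to the update handler to the caller.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def ids_from_add_xml(xml_text, id_field="id"):
    """Collect the unique-key values of all <doc> elements in an <add> file."""
    root = ET.fromstring(xml_text)
    return [
        field.text
        for doc in root.iter("doc")
        for field in doc.findall("field")
        if field.get("name") == id_field and field.text
    ]


def delete_by_ids_xml(ids, id_field="id"):
    """Build a delete-by-query message of the form id:("a" "b" ...)."""
    if not ids:
        raise ValueError("at least one id is required")
    quoted = " ".join(
        '"' + i.replace("\\", "\\\\").replace('"', '\\"') + '"' for i in ids
    )
    return "<delete><query>" + escape(f"{id_field}:({quoted})") + "</query></delete>"


# Hypothetical content standing in for file C.
file_c = (
    '<add><doc><field name="id">C1</field><field name="name">x</field></doc>'
    '<doc><field name="id">C2</field></doc></add>'
)
message = delete_by_ids_xml(ids_from_add_xml(file_c))
print(message)
```

A commit is still needed after the delete before the documents disappear from search results.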
Remove indexes of XML file
Hi, This is a newbie question. I have indexed some documents from XML files as indicated in the tutorial http://lucene.apache.org/solr/4_10_1/tutorial.html with the command: java -jar post.jar *.xml I have seen how to delete one document from the index, but how do I delete all the documents that came from a given XML file? For example, if I have indexed files A, B, C, D, etc., how do I delete the documents that came from file C? Is there a command like the one above, or another solution that does not require individual IDs? Thank you. Regards Olivier
Website running Solr
Hi All, Is there a way to know whether a website uses Solr? Thanks. Regards Olivier
Topology of Solr use
Hi All, I would like to get an idea of Solr usage: number of users, industries, countries, or any other helpful information. Thank you. Regards Olivier
Re: Topology of Solr use
Thank you Markus, the link is very useful. Regards Olivier

2014-04-17 18:24 GMT+02:00 Markus Jelsma markus.jel...@openindex.io:
This may help a bit: https://wiki.apache.org/solr/PublicServers
Querying specific database attributes or table
Hi, I am new to Solr. I would like to index and query a relational database. Is it possible to query a specific table or attribute of the database? For example, if I have two tables A and B that both have the attribute name, can I get results only from table A and not from table B? Can I restrict the query to one table without getting results from the other tables? Is it possible to query a specific attribute of a table? Is it possible to do a join query as in SQL? Any suggestion is welcome. Thank you. Regards Olivier