Re: [Dspace-tech] [Dspace-general] How to implements Lucene
Hi Álvaro,

first of all, please don't cross-post. When you reply to this mail, please don't include dspace-general. This question belongs on dspace-tech, so let's continue the conversation there.

Your question is not entirely clear. If you want to build a search engine (also known as a central index in the library world) using Lucene, why are you asking on a DSpace mailing list? DSpace happens to use Lucene (actually in two independent forms, Lucene and Solr), but from the point of view of a search engine, that doesn't matter. You should access its data via network protocols.

Now you have two options, depending on what you want to build:

a) if you want to build a metasearch engine (one that issues requests to multiple systems at the same time and waits for all of them to respond), then to connect DSpace as a source you may use its Solr search interface [1]

b) if you want to build a central index (one that harvests its source systems periodically and then queries only its own index), then connect to DSpace via OAI-PMH and harvest all available items [2]

Also, if you're thinking of building upon Lucene, you may want to go with Solr or ElasticSearch from the start. They both build on top of Lucene.

[1] http://wiki.duraspace.org/display/DSPACE/Solr
[2] https://wiki.duraspace.org/display/DSDOC3x/OAI

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
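For illustration only, here is a minimal Python sketch of option (b) from the message above: harvesting records over OAI-PMH. It is not DSpace code and was not part of the original reply. The function names are hypothetical. The `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespaces are taken from the OAI-PMH 2.0 specification; the sketch assumes the repository exposes the `oai_dc` (Dublin Core) format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # resumptionToken is an exclusive argument: only the verb accompanies it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_data):
    """Return (records, resumption_token) for one page of a ListRecords response."""
    root = ET.fromstring(xml_data)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # An absent or empty token means this was the last page.
    return records, (token.strip() or None)


def harvest(base_url):
    """Yield every record from the endpoint, following resumption tokens."""
    token = None
    while True:
        with urlopen(build_list_records_url(base_url, token)) as response:
            records, token = parse_list_records(response.read())
        yield from records
        if token is None:
            break
```

A periodic job could then call `harvest()` with the OAI-PMH base URL of each repository and feed the resulting records into the central index. Error handling, incremental harvesting by date, and deleted-record handling are left out of this sketch.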
Re: [Dspace-tech] [Dspace-general] How to implements Lucene
On Mon, Jan 28, 2013 at 6:17 PM, Álvaro Vargas Quezada al...@outlook.com wrote:

> Ok, let's see if I understood: I want to have a central index that provides me a basic interface to search in various data sources (i.e. databases). The method that we want to use is the second: index periodically and consult its own index. I guess this way is faster.

OK

> So, that means that we are going to use DSpace + Lucene with the OAI protocol, isn't it?

As I wrote, this part is confusing. For your central index you don't need DSpace at all; DSpace is just one of the sources that can be connected. Right? Just confirming that we understand each other. If you think you need DSpace for your central index to work, then we don't understand each other and you need to explain more about what you want to achieve.

> You recommend that I use the OAI protocol. I agree, but is this possible with other databases like SQL and Oracle?

So, the architecture would be your central index harvesting from multiple sources, e.g.:

Central index:
* your own Lucene-based program

Sources:
* DSpace (via OAI-PMH)
* Oracle (via SQL)
* some other source, e.g. via Solr
* yet another source, e.g. via CSV import
* ...

So SQL could be just another harvesting protocol (e.g. SELECT title, author, keywords FROM items).

There is one more important point I meant to include in my previous email but forgot: maybe you don't need to start from scratch. Take a look at RCAAP [1] [2], and if you think you want something like that, contact João Melo.

[1] http://www.rcaap.pt
[2] http://www.rcaap.pt/help.jsp

Regards,
~~helix84
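The architecture described in the message above can be sketched in Python as follows. This is an illustration under stated assumptions, not anything from the original thread: `CentralIndex` is a toy in-memory inverted index standing in for the Lucene-based program, an in-memory SQLite database stands in for Oracle, and all function and class names are hypothetical. Only the SQL query is taken from the message.

```python
import csv
import io
import re
import sqlite3
from collections import defaultdict


def harvest_sql(connection):
    """Harvest records over SQL, using the query quoted in the thread."""
    cursor = connection.execute("SELECT title, author, keywords FROM items")
    for title, author, keywords in cursor:
        yield {"title": title, "author": author, "keywords": keywords}


def harvest_csv(text):
    """Harvest records from CSV text with title, author, keywords columns."""
    yield from csv.DictReader(io.StringIO(text))


class CentralIndex:
    """Toy inverted index; a real system would use Lucene, Solr or similar."""

    def __init__(self):
        self.records = []
        self.postings = defaultdict(set)  # term -> positions in self.records

    def add(self, source_name, record):
        """Store a record and index every word of every field."""
        position = len(self.records)
        self.records.append({"source": source_name, **record})
        for value in record.values():
            for term in re.findall(r"\w+", str(value or "").lower()):
                self.postings[term].add(position)

    def search(self, query):
        """Return records that contain every query term (AND semantics)."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.records[i] for i in sorted(hits)]


if __name__ == "__main__":
    # Stand-in for the Oracle source: an in-memory SQLite table.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (title TEXT, author TEXT, keywords TEXT)")
    db.execute("INSERT INTO items VALUES ('Search engines', 'Smith', 'lucene')")

    index = CentralIndex()
    for record in harvest_sql(db):
        index.add("sql", record)
    for record in harvest_csv("title,author,keywords\nGuide,Jones,dspace lucene\n"):
        index.add("csv", record)

    # Queries hit only the central index, never the sources themselves.
    for hit in index.search("lucene"):
        print(hit["source"], hit["title"])
```

Each source only needs a harvester that yields plain records, so an OAI-PMH or Solr harvester could be added in the same way without changing the index.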
Re: [Dspace-tech] [Dspace-general] How to implements Lucene
Perfect. I was confused about the use of DSpace. You're right, we don't need DSpace at all; Lucene and those protocols will be enough in the end.

Again, thank you very much for the information, helix84. I will investigate RCAAP.

Greetings!

From: heli...@centrum.sk
Date: Mon, 28 Jan 2013 18:50:04 +0100
Subject: Re: [Dspace-general] How to implements Lucene
To: al...@outlook.com
CC: dspace-tech@lists.sourceforge.net; alvaro.vargas.quez...@gmail.com