[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886548#action_12886548 ]

Karl Wright commented on CONNECTORS-55:

Mark, it took most of a month to get Derby working, and to do it I needed to disable certain functionality in LCF. No performance tuning or analysis has yet been done on Derby, and I would not be surprised if another month were required to complete that. The point is that switching databases is by no means a "plug and play" operation - there are just too many side effects (e.g. query A performs wonderfully on database X, but you need to use query B or you're dead on database Y). Jack, for example, was extremely surprised to learn that embedded Derby would not allow more than one process to access the database at a time - and Jack was the one advocating most strongly for Derby support!

I therefore strongly suggest a cautious approach when considering introducing additional databases. Testing of any change also becomes much more difficult the more supported databases there are. So, in my view, one really must ask, "What unmet scenario do you see that would demand support for this database?", before just going ahead and deciding to support whatever may be out there.

I realize this cautious approach is diametrically opposed to your stated goal of supporting "other java databases". Perhaps you could clarify your request so that we could understand your true goal here.

> Bundle database server with LCF packaged product
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
> Issue Type: Improvement
> Components: Framework core
> Reporter: Jack Krupansky
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (CONNECTORS-56) All features should be accessible through an API
All features should be accessible through an API

Key: CONNECTORS-56
URL: https://issues.apache.org/jira/browse/CONNECTORS-56
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Jack Krupansky

LCF consists of a full-featured crawling engine and a full-featured user interface to access the features of that engine, but some applications are better served with a full API that lets the application control the crawling engine, including creation and editing of connections and creation, editing, and control of jobs.

Put simply, everything that a user can accomplish via the LCF UI should be doable through an LCF API. All LCF objects should be queryable through the API.

A primary use case is Solr applications which currently use Aperture for crawling, but would prefer the full-featured capabilities of LCF as a crawling engine over Aperture.

I do not wish to over-specify the API in this initial description, but I think the LCF API should probably be a traditional REST API, with some of the API elements specified via the context path, some parameters via URL query parameters, and complex, detailed structures as JSON (or similar). The precise details of the API are beyond the scope of this initial description and will be added incrementally once the high-level approach to the API becomes reasonably settled.

A job status and event reporting scheme is also needed in conjunction with the LCF API. That requirement has already been captured as CONNECTORS-41.

The intention for the API is to create, edit, access, and control all of the objects managed by LCF. The main focus is on repositories, jobs, and status, and less on document-specific crawling information, but there may be some benefit to querying crawling status for individual documents as well.

Nothing in this proposal should in any way limit or constrain the features that will be available in the LCF UI. The intent is that LCF should continue to have a full-featured UI in addition to a full-featured API.

Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.
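To make the REST shape proposed in CONNECTORS-56 a little more concrete, here is a minimal sketch of how a client might assemble such a request: object type and name in the path, simple parameters in the query string, and a structured object as JSON. Everything named here is an assumption made for illustration only: the issue deliberately leaves the API unspecified, so the `lcf-api` base path, the `jobs` element, the `action` parameter, and the JSON field names are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Illustrative only: every path element, parameter, and field name is hypothetical.
final class ApiRequestSketch {
    private ApiRequestSketch() {}

    // Percent-encodes a value. URLEncoder does form encoding (space -> "+"),
    // so "+" is rewritten to "%20"; a literal "+" in the input is already "%2B".
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds "<base>/<objectType>/<objectName>?key=value&key=value".
    static String buildUrl(String base, String objectType, String objectName,
                           Map<String, String> query) {
        StringBuilder url = new StringBuilder(base)
            .append('/').append(encode(objectType))
            .append('/').append(encode(objectName));
        char separator = '?';
        for (Map.Entry<String, String> entry : query.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // Quotes a string for JSON. Only backslash and double quote are escaped;
    // control characters are not handled in this sketch.
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serializes a hypothetical job definition as a small JSON object.
    static String jobJson(String description, String repositoryConnection) {
        return "{\"description\":" + jsonString(description)
            + ",\"repository_connection\":" + jsonString(repositoryConnection) + "}";
    }
}
```

A call such as `buildUrl("http://localhost:8080/lcf-api", "jobs", "nightly crawl", query)` with `action=start` in the map yields a URL with the job name in the path and the action in the query string, while `jobJson(...)` would supply the request body when creating or editing a job.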
[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886490#action_12886490 ]

Jack Krupansky commented on CONNECTORS-55:

I was using the term "install" loosely, not in the sense of a typical package with a GUI wizard and lots of stuff going on, but in the sense of raw Solr, where you download, unzip, and the files are in subdirectories right where they need to be. In that sense, the theory is that a subset of PostgreSQL could be in a subdirectory. Some enterprising vendor, such as Lucid Imagination, might want to have a fancy GUI install, but that would be beyond the scope of what I intended here.

> Bundle database server with LCF packaged product
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
> Issue Type: Improvement
> Components: Framework core
> Reporter: Jack Krupansky
[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886494#action_12886494 ]

Mark Miller commented on CONNECTORS-55:

All the more reason to get LCF working completely with other Java databases.

> Bundle database server with LCF packaged product
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
> Issue Type: Improvement
> Components: Framework core
> Reporter: Jack Krupansky
[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886475#action_12886475 ]

Karl Wright commented on CONNECTORS-55:

Hi Jack,

This seems to me to be beyond the scope of most open-source installers. I've constructed installers involving Postgres before, and the integration possibilities are very limited. Furthermore, you would need a totally different installer for Windows, Debian, Red Hat, Solaris, the Mac, etc. Many of these platforms do not work well with bundles and use a dependency model instead in any case.

--- original message ---
From: "ext Jack Krupansky (JIRA)"
Subject: [jira] Created: (CONNECTORS-55) Bundle database server with LCF packaged product
Date: July 8, 2010
Time: 4:35:20 PM

> Bundle database server with LCF packaged product
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
> Issue Type: Improvement
> Components: Framework core
> Reporter: Jack Krupansky
[jira] Created: (CONNECTORS-55) Bundle database server with LCF packaged product
Bundle database server with LCF packaged product

Key: CONNECTORS-55
URL: https://issues.apache.org/jira/browse/CONNECTORS-55
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Jack Krupansky

The current requirement that the user install and deploy a PostgreSQL server complicates the installation and deployment of LCF for the user. Installation and deployment of LCF should be as simple as Solr itself. QuickStart is great for the low-end and basic evaluation, but a comparable level of simplified installation and deployment is still needed for full-blown, high-end environments that need the full performance of a PostgreSQL-class database server.

So, PostgreSQL should be bundled with the packaged release of LCF so that installation and deployment of LCF will automatically install and deploy a subset of the full PostgreSQL distribution that is sufficient for the needs of LCF. Starting LCF, with or without the LCF UI, should automatically start the database server. Shutting down LCF should also shut down the database server process.

A typical use case would be for a non-developer who is comfortable with Solr and simply wants to crawl documents from, for example, a SharePoint repository and feed them into Solr. QuickStart should work well for the low end or in the early stages of evaluation, but the user would prefer to evaluate "the real thing" with something resembling a production crawl of thousands of documents. Such a user might not be a hard-core developer or be comfortable fiddling with a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to override the defaults, but the LCF package should have all of the best-practice settings deemed appropriate for use with LCF.

One downside is that installation and deployment will be platform-specific since there are multiple processes and PostgreSQL itself requires a platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable future, but nothing here is intended to preclude support for other database servers in future releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
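The start/stop coupling requested in CONNECTORS-55 could be sketched roughly as follows, assuming the bundled PostgreSQL subset ships the standard `pg_ctl` utility. This is not how LCF does or will do it; the class name, directory layout, and the choice of a JVM shutdown hook are assumptions made purely for illustration. Only the `pg_ctl` options used (`start`, `stop`, `-D`, `-l`, `-w`, `-m fast`) are standard PostgreSQL.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper: starts a bundled PostgreSQL with the application
// and stops it again when the JVM exits.
final class BundledDatabaseLifecycle {
    private final String pgCtl;    // path to the pg_ctl executable (assumed to be bundled)
    private final String dataDir;  // PostgreSQL data directory
    private final String logFile;  // file that receives the server log

    BundledDatabaseLifecycle(String pgCtl, String dataDir, String logFile) {
        this.pgCtl = pgCtl;
        this.dataDir = dataDir;
        this.logFile = logFile;
    }

    // "pg_ctl start -D <dir> -l <log> -w": -w waits until startup has completed.
    List<String> startCommand() {
        return Arrays.asList(pgCtl, "start", "-D", dataDir, "-l", logFile, "-w");
    }

    // "pg_ctl stop -D <dir> -m fast -w": fast mode disconnects clients and shuts down.
    List<String> stopCommand() {
        return Arrays.asList(pgCtl, "stop", "-D", dataDir, "-m", "fast", "-w");
    }

    // Runs a command, forwarding its output, and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    // Starts the server and registers a hook that stops it when the JVM shuts down.
    void startWithShutdownHook() throws IOException, InterruptedException {
        int exitCode = run(startCommand());
        if (exitCode != 0) {
            throw new IOException("pg_ctl start failed with exit code " + exitCode);
        }
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                run(stopCommand());
            } catch (IOException e) {
                System.err.println("Could not stop database server: " + e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
    }
}
```

Keeping command construction separate from execution means user-supplied overrides (a different data directory or log location) only change the constructor arguments. A shutdown hook does not run if the JVM is killed forcibly, so a real deployment would still need a way to stop an orphaned server.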
[jira] Created: (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests
A Filesystem output connector would be useful and would allow more complete unit tests

Key: CONNECTORS-54
URL: https://issues.apache.org/jira/browse/CONNECTORS-54
Project: Lucene Connector Framework
Issue Type: Improvement
Reporter: Karl Wright

Right now, the unit tests are limited because there is no way to check that the "indexed" files actually do get indexed. The addition of a filesystem output connector would allow more complete tests to be constructed. In addition, such a connector might well be useful in its own right.

The connector would need to convert URIs into relative file paths, but other than that there's really nothing very tricky about it. Configuration information is minimal; the root path of the output is all that's needed.
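The one non-trivial step named in CONNECTORS-54, converting a document URI into a relative file path, might look something like the sketch below. The mapping scheme is an assumption, not the connector's actual design: scheme, host (plus port), and each path segment become directory levels, unsafe characters become underscores, and a URI ending in a slash is stored as a file named `index`. The class and method names are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a filesystem output connector.
final class UriPathMapper {
    private UriPathMapper() {}

    // Replaces characters outside [A-Za-z0-9._-] with '_' and neutralizes the
    // special segments "." and ".." so the result cannot escape the output root.
    static String sanitize(String segment) {
        String cleaned = segment.replaceAll("[^A-Za-z0-9._-]", "_");
        if (cleaned.equals(".") || cleaned.equals("..")) {
            return cleaned.replace('.', '_');
        }
        return cleaned;
    }

    // Maps a URI to a '/'-separated relative path.
    // URI.create throws IllegalArgumentException for a malformed URI.
    static String toRelativePath(String documentUri) {
        URI uri = URI.create(documentUri).normalize();
        List<String> parts = new ArrayList<>();
        if (uri.getScheme() != null) {
            parts.add(sanitize(uri.getScheme()));
        }
        if (uri.getHost() != null) {
            String host = uri.getHost();
            if (uri.getPort() != -1) {
                host = host + "_" + uri.getPort();
            }
            parts.add(sanitize(host));
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                parts.add(sanitize(segment));
            }
        }
        // Directory-style URIs still need a file name to write to.
        if (path.isEmpty() || path.endsWith("/")) {
            parts.add("index");
        }
        // Fold the query string into the file name so distinct queries map to distinct files.
        if (uri.getQuery() != null) {
            int last = parts.size() - 1;
            parts.set(last, parts.get(last) + "_" + sanitize(uri.getQuery()));
        }
        return String.join("/", parts);
    }
}
```

The connector would then resolve the returned path against its configured root directory. Distinct URIs can still collide after sanitizing (for example `a?b` and `a_b`), so a real implementation might append a hash of the original URI to the file name.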
[jira] Resolved: (CONNECTORS-52) Update documentation to reflect changes to Solr Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-52.

Assignee: Karl Wright
Resolution: Fixed

Documentation and screen shots have now been updated.

> Update documentation to reflect changes to Solr Connector
>
> Key: CONNECTORS-52
> URL: https://issues.apache.org/jira/browse/CONNECTORS-52
> Project: Lucene Connector Framework
> Issue Type: Bug
> Components: Documentation
> Reporter: Karl Wright
> Assignee: Karl Wright
>
> The Solr Connector has sprouted various new tabs and features lately. The
> end-user documentation for it should be revamped to match the software.