[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886548#action_12886548
 ] 

Karl Wright commented on CONNECTORS-55:
---

Mark, it took most of a month to get Derby working, and to do it I needed to 
disable certain functionality in LCF.  No performance tuning or analysis has 
yet been done on Derby, and I would not be surprised if another month were 
required to complete that.  The point is that switching databases is by no 
means a "plug and play" operation - there are just way too many side effects 
(e.g. query A performs wonderfully on database X, but you need to use query B 
or you're dead on database Y).  Jack, for example, was extremely surprised to 
learn that embedded Derby would not allow more than one process to access the 
database at a time - and Jack was the one advocating most strongly for Derby 
support!
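
To make the "query A on database X, query B on database Y" point concrete, here 
is a minimal sketch (not LCF code) of keeping one query variant per database. 
The table and column names are made up, and the example shows only a syntax 
difference (row limiting), which is the mildest form of the problem; 
performance differences need the same kind of per-database handling.

```python
# Hypothetical illustration only: "documents" and "priority" are invented names,
# not part of the LCF schema.
ROW_LIMIT_QUERIES = {
    # PostgreSQL accepts LIMIT for row limiting.
    "postgresql": "SELECT id FROM documents ORDER BY priority LIMIT {n}",
    # Derby uses the SQL-standard FETCH FIRST clause instead.
    "derby": "SELECT id FROM documents ORDER BY priority "
             "FETCH FIRST {n} ROWS ONLY",
}


def top_documents_query(database: str, n: int) -> str:
    """Return the query variant written for the given database."""
    try:
        template = ROW_LIMIT_QUERIES[database.lower()]
    except KeyError:
        # Every additional database needs its own variant, written and tested.
        raise ValueError(f"no query variant for database {database!r}") from None
    return template.format(n=int(n))
```

Each new entry in such a table is one more combination that has to be tested 
on every change.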

I therefore strongly suggest a cautious approach when considering introducing 
additional databases.  Testing of any change also becomes much more difficult 
the more supported databases there are.  So, in my view, one really must ask, 
"What unmet scenario do you see that would demand support for this database?", 
before just going ahead and deciding to support whatever may be out there.  I 
realize this cautious approach is diametrically opposed to your stated goal of 
supporting "other java databases".  Perhaps you could clarify your request so 
that we could understand your true goal here.




> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a PostgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shut down the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in future releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (CONNECTORS-56) All features should be accessible through an API

2010-07-08 Thread Jack Krupansky (JIRA)
All features should be accessible through an API


 Key: CONNECTORS-56
 URL: https://issues.apache.org/jira/browse/CONNECTORS-56
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky


LCF consists of a full-featured crawling engine and a full-featured user 
interface to access the features of that engine, but some applications are 
better served with a full API that lets the application control the crawling 
engine, including creation and editing of connections and creation, editing, 
and control of jobs. Put simply, everything that a user can accomplish via the 
LCF UI should be doable through an LCF API. All LCF objects should be queryable 
through the API.

A primary use case is Solr applications which currently use Aperture for 
crawling, but would prefer the full-featured capabilities of LCF as a crawling 
engine over Aperture.

I do not wish to over-specify the API in this initial description, but I think 
the LCF API should probably be a traditional REST API, with some of the API 
elements specified via the context path, some parameters via URL query 
parameters, and complex, detailed structures as JSON (or similar). The precise 
details of the API are beyond the scope of this initial description and will be 
added incrementally once the high-level approach to the API becomes reasonably 
settled.
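
As a rough illustration of that split (context path, query parameters, JSON 
body), here is a minimal sketch. The base URL, the "jobs" resource, the 
"action" parameter, and the body fields are all hypothetical; no API has been 
specified yet.

```python
import json
from urllib.parse import quote, urlencode


def build_request(base_url, resource, name, query=None, body=None):
    """Assemble a REST-style URL and JSON payload (hypothetical API shape)."""
    # The object being addressed goes in the context path.
    path = "/".join(quote(part, safe="") for part in (resource, name))
    url = f"{base_url.rstrip('/')}/{path}"
    # Simple scalar options go in URL query parameters.
    if query:
        url += "?" + urlencode(query)
    # Complex, detailed structures travel as a JSON document.
    payload = json.dumps(body, sort_keys=True) if body is not None else None
    return url, payload


url, payload = build_request(
    "http://localhost:8080/lcf/api",   # assumed base URL
    "jobs", "nightly crawl",           # assumed resource and job name
    query={"action": "start"},
    body={"schedule": "daily"},
)
```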

A job status and event reporting scheme is also needed in conjunction with the 
LCF API. That requirement has already been captured as CONNECTORS-41.

The intention for the API is to create, edit, access, and control all of the 
objects managed by LCF. The main focus is on repositories, jobs, and status, 
and less on document-specific crawling information, but there may be some 
benefit to querying crawling status for individual documents as well.

Nothing in this proposal should in any way limit or constrain the features that 
will be available in the LCF UI. The intent is that LCF should continue to have 
a full-featured UI in addition to a full-featured API.

Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886490#action_12886490
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

I was using the term "install" loosely, not so much the way a typical package 
has a GUI wizard and lots of stuff going on, but more in the sense of raw Solr 
where you download, unzip, and files are in sub directories right where they 
need to be. In that sense, the theory is that a subset of PostgreSQL could be 
in a subdirectory.

Some enterprising vendor, such as Lucid Imagination, might want to have a fancy 
GUI install, but that would be beyond the scope of what I intended here.





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886494#action_12886494
 ] 

Mark Miller commented on CONNECTORS-55:
---

All the more reason to get LCF working completely with other Java databases.




[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886475#action_12886475
 ] 

Karl Wright commented on CONNECTORS-55:
---

Hi Jack,
This seems to me to be beyond the scope of most open-source installers.  I've 
constructed installers involving Postgres before and the integration 
possibilities are very limited.  Furthermore, you would need a totally 
different installer for Windows, Debian, Red Hat, Solaris, the Mac, etc.  Many 
of these platforms do not work well with bundles but instead use a dependency 
model in any case.

--- original message ---
From: "ext Jack Krupansky (JIRA)" 
Subject: [jira] Created: (CONNECTORS-55) Bundle database server with LCF 
packaged product
Date: July 8, 2010
Time: 4:35:20  PM


Bundle database server with LCF packaged product


 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky


The current requirement that the user install and deploy a PostgreSQL server 
complicates the installation and deployment of LCF for the user. Installation 
and deployment of LCF should be as simple as Solr itself. QuickStart is great 
for the low-end and basic evaluation, but a comparable level of simplified 
installation and deployment is still needed for full-blown, high-end 
environments that need the full performance of a PostgreSQL-class database 
server. So, PostgreSQL should be bundled with the packaged release of LCF so 
that installation and deployment of LCF will automatically install and deploy a 
subset of the full PostgreSQL distribution that is sufficient for the needs of 
LCF. Starting LCF, with or without the LCF UI, should automatically start the 
database server. Shutting down LCF should also shut down the database server 
process.

A typical use case would be for a non-developer who is comfortable with Solr 
and simply wants to crawl documents from, for example, a SharePoint repository 
and feed them into Solr. QuickStart should work well for the low end or in the 
early stages of evaluation, but the user would prefer to evaluate "the real 
thing" with something resembling a production crawl of thousands of documents. 
Such a user might not be a hard-core developer or be comfortable fiddling with 
a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to 
override the defaults, but the LCF package should have all of the best-practice 
settings deemed appropriate for use with LCF.

One downside is that installation and deployment will be platform-specific 
since there are multiple processes and PostgreSQL itself requires a 
platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable 
future, but nothing here is intended to preclude support for other database 
servers in future releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.



[jira] Created: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Jack Krupansky (JIRA)
Bundle database server with LCF packaged product


 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky


The current requirement that the user install and deploy a PostgreSQL server 
complicates the installation and deployment of LCF for the user. Installation 
and deployment of LCF should be as simple as Solr itself. QuickStart is great 
for the low-end and basic evaluation, but a comparable level of simplified 
installation and deployment is still needed for full-blown, high-end 
environments that need the full performance of a PostgreSQL-class database 
server. So, PostgreSQL should be bundled with the packaged release of LCF so 
that installation and deployment of LCF will automatically install and deploy a 
subset of the full PostgreSQL distribution that is sufficient for the needs of 
LCF. Starting LCF, with or without the LCF UI, should automatically start the 
database server. Shutting down LCF should also shut down the database server 
process.
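
The start/stop coupling described above could be sketched roughly as follows. 
This is only an illustration under assumptions: it presumes a bundled pg_ctl 
binary and an already-initialized data directory, and the function name and 
parameters are invented for the example.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def bundled_database(data_dir, pg_ctl="pg_ctl", run=subprocess.check_call):
    """Start the database before LCF runs and stop it when LCF exits.

    data_dir and pg_ctl are assumed locations inside the unpacked package;
    run is injectable so the sketch can be exercised without PostgreSQL.
    """
    # -w makes pg_ctl wait until the server is accepting connections.
    run([pg_ctl, "-D", data_dir, "-w", "start"])
    try:
        yield
    finally:
        # Always attempt a fast shutdown, even if LCF itself failed.
        run([pg_ctl, "-D", data_dir, "-m", "fast", "stop"])
```

The LCF startup code would then wrap its own lifetime in 
"with bundled_database(...):" so that the database server never outlives it.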

A typical use case would be for a non-developer who is comfortable with Solr 
and simply wants to crawl documents from, for example, a SharePoint repository 
and feed them into Solr. QuickStart should work well for the low end or in the 
early stages of evaluation, but the user would prefer to evaluate "the real 
thing" with something resembling a production crawl of thousands of documents. 
Such a user might not be a hard-core developer or be comfortable fiddling with 
a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to 
override the defaults, but the LCF package should have all of the best-practice 
settings deemed appropriate for use with LCF.
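
A minimal sketch of "packaged defaults, user overrides" follows. The setting 
names and values are placeholders for illustration (5432 is the standard 
PostgreSQL port; the rest are not recommended values).

```python
# Placeholder defaults; real best-practice values would ship with the package.
DEFAULT_DB_SETTINGS = {
    "host": "localhost",
    "port": 5432,
    "max_connections": 50,
}


def effective_settings(overrides=None):
    """Merge user-supplied overrides on top of the packaged defaults."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_DB_SETTINGS)
    if unknown:
        # Reject misspelled keys instead of silently ignoring them.
        raise ValueError(f"unknown database settings: {sorted(unknown)}")
    # Later entries win, so user values replace the defaults.
    return {**DEFAULT_DB_SETTINGS, **overrides}
```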

One downside is that installation and deployment will be platform-specific 
since there are multiple processes and PostgreSQL itself requires a 
platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable 
future, but nothing here is intended to preclude support for other database 
servers in future releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.





[jira] Created: (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests

2010-07-08 Thread Karl Wright (JIRA)
A Filesystem output connector would be useful and would allow more complete 
unit tests
---

 Key: CONNECTORS-54
 URL: https://issues.apache.org/jira/browse/CONNECTORS-54
 Project: Lucene Connector Framework
  Issue Type: Improvement
Reporter: Karl Wright


Right now, the unit tests are limited because there is no way to check that the 
"indexed" files actually do get indexed.  The addition of a filesystem output 
connector would allow more complete tests to be constructed.  In addition, such 
a connector might well be useful in its own right.

The connector would need to convert URIs into relative file paths, but other 
than that there's really nothing very tricky about it.  Configuration 
information is minimal; just the root path of the output is all that's needed.
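
One possible shape for the URI-to-path conversion is sketched below. The 
mapping (scheme, then host, then path segments) is an assumption made for 
illustration, not a decided design; query strings are ignored and URIs ending 
in "/" are not given any special treatment.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def uri_to_relative_path(uri):
    """Map a document URI to a relative path (assumed layout: scheme/host/path)."""
    parts = urlsplit(uri)
    segments = [parts.scheme or "noscheme"]
    if parts.netloc:
        # ":" is not safe in file names on every platform.
        segments.append(parts.netloc.replace(":", "_"))
    for segment in unquote(parts.path).split("/"):
        if segment in ("", "."):
            continue
        if segment == "..":
            # Refuse anything that could climb out of the output root.
            raise ValueError(f"unsafe path segment in {uri!r}")
        segments.append(segment)
    return PurePosixPath(*segments)


def target_path(root, uri):
    """Join the configured root path with the converted URI."""
    return PurePosixPath(root) / uri_to_relative_path(uri)
```

A unit test could then crawl a known set of files and check that each one 
exists under the root at the path this function predicts.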





[jira] Resolved: (CONNECTORS-52) Update documentation to reflect changes to Solr Connector

2010-07-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-52.
---

  Assignee: Karl Wright
Resolution: Fixed

Documentation and screen shots have now been updated.


> Update documentation to reflect changes to Solr Connector
> -
>
> Key: CONNECTORS-52
> URL: https://issues.apache.org/jira/browse/CONNECTORS-52
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Karl Wright
>Assignee: Karl Wright
>
> The Solr Connector has sprouted various new tabs and features lately.  The 
> end-user documentation for it should be revamped to match the software.
