[jira] [Commented] (CONNECTORS-55) Bundle database server with LCF packaged product

2011-04-02 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015021#comment-13015021
 ] 

Karl Wright commented on CONNECTORS-55:
---

The current status of this feature request is as follows:

- There is a Quick Start, which can run either with Derby or with PostgreSQL, 
depending on what you put in connectors.xml.
- We've done a lot of fixes to the Derby support, although there are two 
outstanding tickets that need to be resolved before it can be considered 
"real".  See CONNECTORS-100 and CONNECTORS-110.
- You can run the Quick Start with embedded Derby in a mode that allows 
external connections. "java -Dderby.drda.startNetworkServer=true -jar 
start.jar".  If you want a truly multiprocess ManifoldCF running, you will also 
need to set up file-based synchronization, as described in 
how-to-build-and-deploy.html.
- Instructions for how to run Quick Start with PostgreSQL are in the FAQ.
- I coded an HSQLDB implementation, but that's stalled because HSQLDB 
internally deadlocks.  See CONNECTORS-114.
- Nobody has looked at H2 in any depth yet.  Not clear if we still need to.



> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: ManifoldCF
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-23 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891533#action_12891533
 ] 

Karl Wright commented on CONNECTORS-55:
---

MVCC is the feature that suggests greater concurrency (and, hence, greater 
performance).


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-22 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891298#action_12891298
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

I checked that H2 feature comparison table, but it did not suggest a great 
benefit of H2 for LCF. The footprint is a little smaller than Derby and of 
course a lot smaller than PostgreSQL. One area not in the table that could 
matter a lot is performance. Any quick thoughts on H2 performance relative to 
PostgreSQL and Derby?

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-20 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890572#action_12890572
 ] 

Karl Wright commented on CONNECTORS-55:
---

H2 looks pretty impressive also, featurewise.
I actually already did a hsqldb driver for LCF but simply have not had a moment 
to even try it out.  It should not be hard to attempt one for H2.  Simple 
problems ought to manifest themselves rather quickly under the unit tests.  
Ideally, though, we need a large test to figure out what embedded database to 
choose.

The hardest methods of the driver to writer are the "interrogation" methods - 
e.g. finding the definitions for tables and indexes, finding out whether a user 
already exists, etc.  There's not enough standardization on how you do this 
across databases, and the way you do it is almost always not well documented 
either.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-20 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890561#action_12890561
 ] 

Otis Gospodnetic commented on CONNECTORS-55:


Joining late.  I've used PG a lot at my previous startup.  Liked it, but it 
doesn't seem like the right DB to embed in a product like LCF.  I'm surprised 
nobody mentioned H2 (which I've never used, but thought that it's THE database 
to use in situations like this one).  Look at the feature table on 
http://www.h2database.com/html/main.html 


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886732#action_12886732
 ] 

Robert Muir commented on CONNECTORS-55:
---

bq. By the way, hsqldb is apparently limited to a 16GB database (version 2.0). 
That's not very much.

No, its just a default I think: 
http://hsqldb.org/doc/2.0/guide/deployment-chapt.html

 ::= SET FILES SCALE 

Changes the scale factor for the .data file. The default scale is 8 and allows 
16GB of data storage capacity. The scale can be increased in order to increase 
the maximum data storage capacity. The scale values 8, 16, 32, 64 and 128 are 
allowed. Scale value 128 allows a maximum capacity of 256GB.



> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886730#action_12886730
 ] 

Karl Wright commented on CONNECTORS-55:
---

The quick-start even takes care of connector registration for you, so 
executecommand is not needed even then.  What you *don't* get to do is use the 
command-based API to control LCF; that's not going to work in the 
single-process model.

By the way, hsqldb is apparently limited to a 16GB database (version 2.0).  
That's not very much.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886724#action_12886724
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

When Karl says "It *does* limit your ability to use other commands 
simultaneously" (referring to use of embedded Derby), he is referring to 
commands executed using the "executecommand" shell script, such as registering 
and unregistering connectors, which is something typically done once before 
starting the UI or once every blue moon when you want to support a new type of 
repository, but not done on as regular a basis as editing connections and jobs 
and running jobs. The java classes to execute those commands would be, by 
definition, outside of the LCF process.

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886721#action_12886721
 ] 

Karl Wright commented on CONNECTORS-55:
---

>>
That is exactly what I was leaning towards - but what kind of hobbled state are 
you in with derby? You said you have to run the db and ui one at a time or 
something? And that many sql queries don't work with derby - that has all been 
addressed already?

<<

The "hobbling" is that you can't sort on some columns in reports that you could 
sort on before when just Postgresql was involved.  Also, that no real 
large-scale perf tests have been done on Derby.  Also that you need to use 
"LIKE" %-based syntax instead of real regular expressions whenever you specify 
regular expressions in your reports.  The quick-start does not limit your 
simultaneous use of UI and crawler - it runs jetty as the app server within the 
same process.  It *does* limit your ability to use other commands 
simultaneously - but you should not need to do that in normal circumstances.

So "that " has indeed already been addressed.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886722#action_12886722
 ] 

Karl Wright commented on CONNECTORS-55:
---

>>
forcing the user to pick the right/acceptable release of PostgreSQL to install 
is error prone and a support headache
<<

Yup.  It is.  Problem is that products/versions get security fixes, CVE's, 
end-of-life notices, etc.  It is beyond the scope of LCF to try and control all 
that - we'd be buying a whole new level of support headache, believe me.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886720#action_12886720
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

Karl notes that "we've had to mess with the stuffer query on pretty near every 
point release of Postgresql". Letting/forcing the user to pick the 
right/acceptable release of PostgreSQL to install is error prone and a support 
headache. I would argue that it is better for the LCF team to bundle the 
right/best release of PostgreSQL with LCF.

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886718#action_12886718
 ] 

Mark Miller commented on CONNECTORS-55:
---

If it's now that easy, then fantastic! The last quick start guide I saw was 
like many thousands of words, so it's nice to see it boiled down to like a 
dozen ;)

That is exactly what I was leaning towards - but what kind of hobbled state are 
you in with derby? You said you have to run the db and ui one at a time or 
something? And that many sql queries don't work with derby - that has all been 
addressed already?

Anyhow, you asked what my goal was for pushing getting a working java db - and 
this is exactly it.

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886717#action_12886717
 ] 

Karl Wright commented on CONNECTORS-55:
---

Mark,

If your concern is about installing LCF, read the Quick Start part of the 
build/deploy page.  You check out, build, and run.  Derby-based.  Nothing else 
to install.   Not hard really.



> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886715#action_12886715
 ] 

Mark Miller commented on CONNECTORS-55:
---

bq. Jack, for example, was extremely surprised to learn that embedded Derby 
would not allow more than one process to access the database at a time

You can switch to a mode that allows multiple connections - I've started with 
one and moved to the other in the past - its been too long so I'd have to go 
look, but its entirely possible to use very similar code and run in a server 
mode. 

bq. so that we could understand your true goal here.

To make trying out and installing LCF much easier - the bar on this thing as a 
user and a dev is just kind of ridiculous at the moment. I know a lot of 
improvements are being made, but having to install a separate postgres db just 
to get going - just to try LCF out - is something I'd like to see go away. A 
java db would allow us to get to a point where you download the thing and 
launch it - that would really get this project going. It will be helpful to 
attract both users and a community of devs. There are at least half a dozen or 
more committers that signed up for this project, but the bar is so high to even 
'do anything', that I suspect thats why most of them have yet to contribute 
even a comment - they don't have the time or motivation to go through that huge 
'running LCF' doc - its a bear just to skim, say nothing about go through the 
steps. In general, I like to think of all that work as something the computer 
can do for me ;)

Baby steps though - I'm just pushing in that direction - I know there would be 
a long, long path to get there.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886710#action_12886710
 ] 

Robert Muir commented on CONNECTORS-55:
---

bq. Why? Because Postgresql 8.3 became somehow incredibly sensitive to small 
changes in statistics and would cease to do the right thing very quickly as the 
database changed.

Well, perhaps it might be worth looking into, as hsqldb does less of these type 
of magic optimizations. For example, its up to you to order your join clause in 
a way that makes sense.
By the way, here is the link from the users guide where i pasted the text 
below, you can read more about it here (there is an example query that seems 
similar to yours, etc):

http://hsqldb.org/doc/2.0/guide/sqlgeneral-chapt.html#N107B1

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886703#action_12886703
 ] 

Karl Wright commented on CONNECTORS-55:
---

I should add that, even for Postgresql, we've had to mess with the stuffer 
query on pretty near every point release of Postgresql, to guarantee that it 
continues to meet the basic criteria.  The last change that we needed was to 
perform an ANALYZE before *every* time the query was run.  Why?  Because 
Postgresql 8.3 became somehow incredibly sensitive to small changes in 
statistics and would cease to do the right thing very quickly as the database 
changed.  We looked at this and discovered that it took a specific plan 
optimization path when it thought a particular statistic was 100%, and a 
totally different one when the statistic was anything less than 100%.   A bug?  
Well, no, just a sensitivity...  I guess one could call it a design flaw that 
nobody thought about what might happen if the statistics were slightly out of 
date.

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886704#action_12886704
 ] 

Robert Muir commented on CONNECTORS-55:
---

Karl, just quoting from the docs here. From my experience, i had no perf 
problems in the past getting hsqldb to use my indexes, but my queries were 
relatively simple (just lots of data).
 
bq. Because the queue may be very large, and this query may potentially return 
ALL records in the queue, the query plan MUST wind up reading directly out of 
the priority index, or the query simply will not work. It simply cannot afford 
to read 20 million records into memory and then sort them!

HyperSQL can use an index on an ORDER BY clause if all the columns in ORDER BY 
are in a single-column or multi-column index (in the exact order). This is 
important if there is a LIMIT n (or FETCH n ROWS ONLY) clause. In this 
situation, the use of index allows the query processor to access only the 
number of rows specified in the LIMIT clause, instead of building the whole 
result set, which can be huge. This also works for joined tables when the ORDER 
BY clause is on the columns of the first table in a join. Indexes are used in 
the same way when ORDER BY ... DESC is specified in the query. Note that unlike 
other RDBMS, HyperSQL does not create DESC indexes. It can use any index for 
ORDER BY ... DESC.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886701#action_12886701
 ] 

Karl Wright commented on CONNECTORS-55:
---

>>
Can you help me out and give me more ideas on what particular performance 
problems you are concerned about (e.g. query types or whatever) ?
<<

Hi Robert,
There are two major determinants of performance for LCF, under Postgresql at 
any rate.  The first is the performance of the queue stuffer query, and how 
that scales to when the queue is extremely large.  This is a complex query, but 
its basic form is:

SELECT  FROM  WHERE  AND NOT 
EXISTS() ORDER BY  
ASC LIMIT 

Because the queue may be very large, and this query may potentially return ALL 
records in the queue, the query plan MUST wind up reading directly out of the 
priority index, or the query simply will not work.  It simply cannot afford to 
read 20 million records into memory and then sort them!

The second place performance can be severely impacted is in how parallel writes 
can be.  In postgresql 7.4, for example, everything was single-threaded on 
writes.  This caused web crawling in particular to be poorly performing, 
because every typical web page has a significant number of links that must be 
entered in the queue, and single-threading that process cost some 4x to 10x 
over Postgresql 8.x, which allowed much more parallelism.

Hope this helps.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886696#action_12886696
 ] 

Robert Muir commented on CONNECTORS-55:
---

Karl, I agree with the goals you stated for the Derby impl. 

I just looked at some of the problems you had (such as multi-process support) 
and thought that perhaps these are specific to derby, not a general problem 
across java databases.

I've used hsqldb in both modes that it supports: embedded and client-server. 
Perhaps it might be a good fit, given that the embedded mode could be exploited 
for junit tests, but client-server mode
for production?

Can you help me out and give me more ideas on what particular performance 
problems you are concerned about (e.g. query types or whatever) ?

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886691#action_12886691
 ] 

Karl Wright commented on CONNECTORS-55:
---

Robert,

I'm not opposed to implementing support for hsqldb, but let's be clear on the 
goals here.

The initial goal for doing the Derby implementation was simply to be able to 
write unit tests, and to make Jack happy.  Later, because of the fact that 
Derby has an embedded JDBC mode of operation, it was possible to construct the 
LCF Quick-Start to use it.  So it made it possible to have an unzip-and-go 
solution.

What would the be goal of using hsqldb?   It seems to support an embedded mode, 
so it could certainly be used instead of Derby, wherever we are currently using 
Derby.  Since it fully supports MVCC it is certainly much closer to Postgresql 
in actual operation than Derby is, so chances are good that we'd find fewer 
issues in scaling than with Derby.  If this is the approach you are suggesting, 
I would suggest dropping support for Derby and simply replace it with Hsqldb.  
We'd leave the Derby implementation class around, of course, but we'd not tune 
against it or test against it.

FWIW, if hsqldb is sufficiently performant, I could foresee also dropping 
support for postgresql in the future, in the same way.  But that is yet to be 
proven.  And, indeed, that's what the problem is - there's no way to know in 
advance of doing the work how exactly things will pan out.  So if that's the 
true goal, we've got a fair bit of work to do before deciding whether Hsqldb or 
Derby or any other Java database can actually do what we need it to.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886688#action_12886688
 ] 

Robert Muir commented on CONNECTORS-55:
---

Karl, just an idea.

given the issues you had with Derby, instead of doing performance tuning or 
analysis, maybe Mark's idea is a good approach first.

I know i originally suggested hsqldb, i dont know anything about Derby except 
that its apache, but ive done real work with hsqldb.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886548#action_12886548
 ] 

Karl Wright commented on CONNECTORS-55:
---

Mark, it took most of a month to get Derby working, and to do it I needed to 
disable certain functionality in LCF.  No performance tuning or analysis has 
yet been done on Derby, and I would not be surprised if another month was 
required to complete that.  Point being that it is by no means ever a "plug and 
play" operation to switch databases - there are just way too many side effects 
(e.g. query A performs wonderfully on database X, but you need to use query B 
or you're dead on database Y).  Jack, for example, was extremely surprised to 
learn that embedded Derby would not allow more than one process to access the 
database at a time - and Jack was the one advocating most strongly for Derby 
support!

I therefore strongly suggest a cautious approach when considering Introducing 
additional databases.  Testing of any change also becomes much more difficult 
the more supported databases there are.  So, in my view, one really must ask, 
"What unmet scenario do you see that would demand support for this database?", 
before just going ahead and deciding to support whatever may be out there.  I 
realize this cautious approach is diametrically opposed to your stated goal of 
supporting "other java databases".  Perhaps you could clarify your request so 
that we could understand your true goal here.




> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886490#action_12886490
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

I was using the term "install" loosely, not so much the way a typical package 
has a GUI wizard and lots of stuff going on, but more in the sense of raw Solr 
where you download, unzip, and files are in sub directories right where they 
need to be. In that sense, the theory is that a subset of PostgreSQL could be 
in a subdirectory.

Some enterprising vendor, such as Lucid Imagination, might want to have a fancy 
GUI install, but that would be beyond the scope of what I intended here.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886494#action_12886494
 ] 

Mark Miller commented on CONNECTORS-55:
---

All the more reason to get LCF working completely with other Java databases.

> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in futures releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886475#action_12886475
 ] 

Karl Wright commented on CONNECTORS-55:
---

Hi jack,
This seems to me to be beyond the scope of most open-source installers.  I've 
constructed installers involving Postgres before and the integration 
possibilities are very limited.  Furthermore, you would need a totally 
different installer for windows, debian, redhat, solaris, the mac, etc.  Many 
of these platforms do not work well with bundles but instead use a dependency 
model in any case.

--- original message ---
From: "ext Jack Krupansky (JIRA)" 
Subject: [jira] Created: (CONNECTORS-55) Bundle database server with LCF 
packaged product
Date: July 8, 2010
Time: 4:35:20  PM


Bundle database server with LCF packaged product


 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky


The current requirement that the user install and deploy a PostgreSQL server 
complicates the installation and deployment of LCF for the user. Installation 
and deployment of LCF should be as simple as Solr itself. QuickStart is great 
for the low-end and basic evaluation, but a comparable level of simplified 
installation and deployment is still needed for full-blown, high-end 
environments that need the full performance of a ProstgreSQL-class database 
server. So, PostgreSQL should be bundled with the packaged release of LCF so 
that installation and deployment of LCF will automatically install and deploy a 
subset of the full PostgreSQL distribution that is sufficient for the needs of 
LCF. Starting LCF, with or without the LCF UI, should automatically start the 
database server. Shutting down LCF should also shutdown the database server 
process.

A typical use case would be for a non-developer who is comfortable with Solr 
and simply wants to crawl documents from, for example, a SharePoint repository 
and feed them into Solr. QuickStart should work well for the low end or in the 
early stages of evaluation, but the user would prefer to evaluate "the real 
thing" with something resembling a production crawl of thousands of documents. 
Such a user might not be a hard-core developer or be comfortable fiddling with 
a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to 
override the defaults, but the LCF package should have all of the best-practice 
settings deemed appropriate for use with LCF.

One downside is that installation and deployment will be platform-specific 
since there are multiple processes and PostgreSQL itself requires a 
platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable 
future, but nothing here is intended to preclude support for other database 
servers in futures releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.




> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a ProstgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shutdown the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or i