from:"Karl Wright \(JIRA\)"

[jira] Created: (CONNECTORS-103) RSS connector: Have better initial default values for throttling

2010-09-07 Thread Karl Wright (JIRA)

RSS connector: Have better initial default values for throttling


 Key: CONNECTORS-103
 URL: https://issues.apache.org/jira/browse/CONNECTORS-103
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: RSS connector
Reporter: Karl Wright
Priority: Minor


When you first create an RSS connector connection, the bandwidth tab should 
come prepopulated with the following values:

Max connections per server: 2
Max KB per second per server: 64
Max fetches per minute per server: 12

Too many casual users of ACF have been crawling without any throttling, and 
that's going to give ACF a bad name in the long run.
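
As a rough illustration of what these defaults imply, the sketch below holds the three proposed values and derives the per-server pacing from them. The class and method names are hypothetical and are not taken from the RSS connector's code.

```java
// Hypothetical holder for the proposed defaults; not the connector's actual class.
final class RssThrottleDefaults {
    static final int MAX_CONNECTIONS_PER_SERVER = 2;
    static final int MAX_KB_PER_SECOND_PER_SERVER = 64;
    static final int MAX_FETCHES_PER_MINUTE_PER_SERVER = 12;

    // Minimum spacing between fetches from one server, in milliseconds.
    static long minMillisBetweenFetches() {
        return 60_000L / MAX_FETCHES_PER_MINUTE_PER_SERVER;
    }

    // Shortest time, in milliseconds, in which a document of the given size
    // can be transferred without exceeding the bandwidth cap (rounded up).
    static long minMillisToTransfer(long documentBytes) {
        long bytesPerSecond = MAX_KB_PER_SECOND_PER_SERVER * 1024L;
        return (documentBytes * 1000L + bytesPerSecond - 1) / bytesPerSecond;
    }
}
```

With these values a single server would see at most one fetch every 5 seconds, and a 64 KB feed would take at least one second to download.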

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Resolved: (CONNECTORS-101) File system connector would benefit by default crawling rules

2010-09-07 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-101.


Fix Version/s: LCF Release 0.5
   Resolution: Fixed

r993551.

By the way, the UI is really pretty bad for this connector also, so I may open 
a ticket to clean that up as well.


 File system connector would benefit by default crawling rules
 -

 Key: CONNECTORS-101
 URL: https://issues.apache.org/jira/browse/CONNECTORS-101
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
 Fix For: LCF Release 0.5


 When you add a path to a file system connector job, it should automatically 
 put in rules that cause it to include all files and directories under that 
 path.  This makes it easier to use, and more easily demonstrable too.


[jira] Created: (CONNECTORS-105) File system connector UI no longer adheres to connector UI standards, needs to be updated

2010-09-07 Thread Karl Wright (JIRA)

File system connector UI no longer adheres to connector UI standards, needs to 
be updated
-

 Key: CONNECTORS-105
 URL: https://issues.apache.org/jira/browse/CONNECTORS-105
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Priority: Minor


The file system connector specification Paths tab no longer adheres to the 
prevailing connector standard, which suggests a table for rule list displays.  
The connector UI should be updated.



[jira] Assigned: (CONNECTORS-105) File system connector UI no longer adheres to connector UI standards, needs to be updated

2010-09-07 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-105:
--

Assignee: Karl Wright

 File system connector UI no longer adheres to connector UI standards, needs 
 to be updated
 -

 Key: CONNECTORS-105
 URL: https://issues.apache.org/jira/browse/CONNECTORS-105
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
 Fix For: LCF Release 0.5


 The file system connector specification Paths tab no longer adheres to the 
 prevailing connector standard, which suggests a table for rule list displays. 
  The connector UI should be updated.


[jira] Resolved: (CONNECTORS-105) File system connector UI no longer adheres to connector UI standards, needs to be updated

2010-09-07 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-105.


Fix Version/s: LCF Release 0.5
   Resolution: Fixed

r993565.




[jira] Commented: (CONNECTORS-41) Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.

2010-08-31 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904582#action_12904582
]

Karl Wright commented on CONNECTORS-41:
---

I looked at this in some detail yesterday. The prime implementation option is
to add notification methods to IOutputConnector, so that job events get
reported to the connector when the job is being terminated. The issue in this
case is going to be how exactly to handle ServiceInterruption exceptions that
occur at the time of the notification into the connector. This is not
hypothetical because in the Solr case a notification may well fail, or it may
take a very long time (many minutes). Usually when there is a possibility of
extended interaction it argues for an additional state in the database.

It looks like it will not be possible to delay the change of the job status,
since that takes place in a transaction. If the notification fails, the job
could otherwise be left in the running state, and a retry would naturally
occur until the commit succeeded. But that doesn't look possible given the
transaction structure.

An alternative (non-notification) method of handling a commit request would
require the commit to take place as part of the output connector's poll()
method. This is a little better to work with because the poll() method will
naturally retry in any case. The issue here is that there would be no
*guarantee* of a commit taking place at all, since it isn't part of the
connector contract that the connection must continue to exist for any period of
time, which I think would violate the spirit of this ticket.

If explicit notification takes place, we could just report any error, and
forget about it, rather than keeping the job alive for a retry. That, too,
would mean that a commit was not guaranteed to occur during the job's lifecycle.

The final alternative, which would seemingly work, would involve there being
two job shutdown states - one prior to notification, and the second after
notification. The first state would be entered based on the current shutdown
logic. The second state would be entered only after the notification had been
successful. Thus, the notification *could* be called more than once, if there
were errors, or if the crawler were shut down and restarted before the state
transition was completed. The extra state would also allow the job's
pre-notification status to be noted in the crawler ui.

Because of the potential time delay of a commit, it is probably best for the
first to second shutdown state transition to be handled by a separate thread,
or family of threads.
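
The two-state shutdown idea can be sketched as a small state machine. Everything here (the state names, the Notifier interface, the method names) is hypothetical and only meant to show why the notification may be called more than once; it is not the framework's job-status code.

```java
// Hypothetical job states; the real framework uses its own status values.
enum JobState { ACTIVE, READY_FOR_NOTIFY, FINISHED }

// Stand-in for a notification hook on the output connector.
interface Notifier {
    // Throws if the target (for example Solr) cannot be reached right now.
    void noteJobComplete() throws Exception;
}

final class ShutdownStateMachine {
    private JobState state = JobState.ACTIVE;

    JobState state() {
        return state;
    }

    // The existing shutdown logic would end here: crawling has stopped,
    // but the output connector has not yet been told.
    void finishCrawling() {
        if (state == JobState.ACTIVE) {
            state = JobState.READY_FOR_NOTIFY;
        }
    }

    // Meant to be called repeatedly from a separate notification thread.
    // Returns true only when the notification succeeded and the job
    // moved to its final state.
    boolean tryNotify(Notifier notifier) {
        if (state != JobState.READY_FOR_NOTIFY) {
            return false;
        }
        try {
            notifier.noteJobComplete();
        } catch (Exception e) {
            // Stay in READY_FOR_NOTIFY so that a later call retries.
            return false;
        }
        state = JobState.FINISHED;
        return true;
    }
}
```

Because a failed attempt leaves the job in the intermediate state, a connector written against such a hook would need to tolerate repeated notifications.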

Add hooks to output connectors for receiving event notifications,
specifically job start, job end, etc.
---

Key: CONNECTORS-41
URL: https://issues.apache.org/jira/browse/CONNECTORS-41
Project: Apache Connectors Framework
Issue Type: Improvement
Components: Framework core
Reporter: Karl Wright
Priority: Minor

Currently there is no logic that informs an output connection of a job start,
end, deletion, or other activity. While this would seem to have little to do
with an output connector, this feature has been requested by Jack Krupansky
as a potential way of deciding when to tell Solr to commit documents, rather
than leave it up to Solr's configuration.


[jira] Commented: (CONNECTORS-41) Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.

2010-08-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904611#action_12904611
 ] 

Karl Wright commented on CONNECTORS-41:
---

Does it make sense to create an entirely new kind of connector just for 
notifications of this sort?  So when you create a job you select THREE 
different kinds of connection (repository, output, and notification)?  That 
seems like supreme overkill to me, and I can well argue that this kind of 
notification really is only useful to an output connection in any case.




[jira] Commented: (CONNECTORS-41) Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.

2010-08-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904622#action_12904622
 ] 

Karl Wright commented on CONNECTORS-41:
---

I think we're discussing two entirely distinct features here.

Feature 1: Let the output connector know that a job is finished, so that it can 
flush whatever internal buffering etc. it has been doing (e.g. tell solr to 
commit).
Feature 2: Provide some general way of monitoring the progress of jobs etc.

Feature 2 is already met by the API, in my opinion.  It's a polling-style 
fulfillment of the requirement, but it does exist.  There doesn't yet seem to 
me to be a requirement that a notification-style API be provided also, despite 
the stated use case.
Feature 1 is what I consider to be the use case for this current ticket.




[jira] Assigned: (CONNECTORS-41) Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.

2010-08-31 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-41:
-

Assignee: Karl Wright

 Add hooks to output connectors for receiving event notifications, 
 specifically job start, job end, etc.
 ---

 Key: CONNECTORS-41
 URL: https://issues.apache.org/jira/browse/CONNECTORS-41
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor

 Currently there is no logic that informs an output connection of a job start, 
 end, deletion, or other activity.  While this would seem to have little to do 
 with an output connector, this feature has been requested by Jack Krupansky 
 as a potential way of deciding when to tell Solr to commit documents, rather 
 than leave it up to Solr's configuration.


[jira] Commented: (CONNECTORS-57) Solr output connector option to commit at end of job, by default

2010-08-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904736#action_12904736
 ] 

Karl Wright commented on CONNECTORS-57:
---

I added unconditional commit support to the Solr connector as part of ticket 
CONNECTORS-41.  With that implementation the commit cannot be turned on and 
off per job, but it could readily be made configurable per Solr connection.  
That makes more sense to me anyway, since what determines whether you want 
this feature on is your Solr configuration, and that's not going to 
change per job.



 Solr output connector option to commit at end of job, by default
 

 Key: CONNECTORS-57
 URL: https://issues.apache.org/jira/browse/CONNECTORS-57
 Project: Apache Connectors Framework
  Issue Type: Sub-task
  Components: Lucene/SOLR connector
Reporter: Jack Krupansky

 By default, Solr will eventually commit documents that have been submitted to 
 the Solr Cell interface, but the time lag can confuse and annoy people. 
 Although commit strategy is a difficult issue in general, an option in LCF to 
 automatically commit at the end of a job, by default, would eliminate a lot 
 of potential confusion and generally be close to what the user needs.
 The desired feature is that there be an option to commit for each job that 
 uses the Solr output connector. This option would default to on (or a 
 different setting based on some global configuration setting), but the user 
 may turn it off if commit is only desired upon completion of some jobs.


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-30 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904205#action_12904205
]

Karl Wright commented on CONNECTORS-92:
---

Jettro,

If you are using maven to start jetty directly, it will not work. You are
missing the jetty runner, which only starts jetty at the end of a number of
steps, including creating the database properly and setting up the schema and
registering the connectors. Then, the crawler itself is started as a separate
thread.
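
The startup order described above can be sketched roughly as follows. This is not the actual jetty runner: the class name, the use of Runnable for each step, and the assumption that starting the web server returns without blocking are all hypothetical.

```java
import java.util.List;

// Hypothetical single-process launcher illustrating the startup order only.
final class SingleProcessLauncher {
    // Runs every setup step in order (database creation, schema setup,
    // connector registration), then starts the web server, and finally
    // starts the crawler on its own thread. Returns the crawler thread.
    static Thread launch(List<Runnable> setupSteps, Runnable startWebServer, Runnable crawler) {
        for (Runnable step : setupSteps) {
            step.run();
        }
        // Assumed to return once the server is listening.
        startWebServer.run();
        Thread crawlerThread = new Thread(crawler, "crawler");
        crawlerThread.start();
        return crawlerThread;
    }
}
```

Starting jetty directly from a build tool would skip the setup steps in the loop, which is the failure mode the comment describes.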

It took me many weeks to get everything to work properly using jetty. Changing
all this stuff around does not seem either warranted or useful at this time. I
strongly recommend that you concentrate on using maven to actually build the
software, and not try to re-engineer the example right now.

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: maven-poms-problem-starting-jetty-and-derby.patch,
move-to-maven-acf-framework.patch, Screen shot 2010-08-23 at 16.31.07.png

I am looking at the current project structure. If we want to make another
build tool available I think we need to change the directory structure. I
tried to place a suggestion in an image. Can you please have a look at it? If
we agree that this is a good way to go, then I will continue to work on a
patch. Which might be a bit hard with all these changing directories, but
I'll do my best to at least get an idea whether it would be working.
So I have three questions:
- Do you want to move to maven or put maven next to ant?
- Do you prefer another build mechanism [ant with ivy, gradle, maven3]
- Do you have an idea about the amount of scripts that need to be changed if
we change the project structure
The image of a possible project layout (that is based on the maven standards)
is attached to the issue


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-30 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904209#action_12904209
]

Karl Wright commented on CONNECTORS-92:
---

I've had a cursory glance at the pom files and they all look reasonable. I'm
going to play around with this a bit locally to see how it behaves, and then if
all seems OK I am happy to commit those.

Move from ant to maven or other build system with decent library management
---


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-30 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904219#action_12904219
]

Karl Wright commented on CONNECTORS-92:
---

bq. I am still thinking about why this is so hard. Would be nice to have
something like a servlet or filter that initializes everything that you do in
your special runner now.

The issues have to do with these facts:

- Embedded derby is single-process. You cannot run more than one process
against a given database at a given time.
- ACF supports both single-process and multi-process models, but IF you're
going to use single-process, you need to have a main class that starts up all
the threads that would otherwise be different processes. That's what
jetty-runner does, in part.

So, obviously, something like jetty-runner needs to exist if you are going to
use derby. I don't think maven magic will suffice to replace the code that
does that.

Furthermore, I think trying to get maven to do this for us is overkill. I'm
open to suggestions, but I still don't think you need to solve this problem in
order to have ACF be built effectively by maven.

What I think we need to build at the framework level are all the jars and wars
(which it looks like you have pretty well specified), PLUS a start.jar (which I
didn't see anywhere - did I miss it?). Then your example execution will not be
a jetty instance per se, but will simply fire off the equivalent of
"java -jar start.jar". I can't believe there isn't a maven plugin for that. This,
of course, must happen at the modules level, because no connectors will be
available at the framework level.

Move from ant to maven or other build system with decent library management
---


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-30 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904324#action_12904324
 ] 

Karl Wright commented on CONNECTORS-92:
---

Another way you can determine what's supposed to be a dependency is to look at 
the start.jar produced by the ant build:

<attribute name="Class-Path" value="lib/commons-codec.jar 
lib/commons-collections.jar lib/commons-el.jar lib/commons-fileupload.jar 
lib/commons-httpclient-acf.jar lib/commons-io.jar lib/commons-logging.jar 
lib/derbyclient.jar lib/derby.jar lib/derbyLocale_cs.jar 
lib/derbyLocale_de_DE.jar lib/derbyLocale_es.jar lib/derbyLocale_fr.jar 
lib/derbyLocale_hu.jar lib/derbyLocale_it.jar lib/derbyLocale_ja_JP.jar 
lib/derbyLocale_ko_KR.jar lib/derbyLocale_pl.jar lib/derbyLocale_pt_BR.jar 
lib/derbyLocale_ru.jar lib/derbyLocale_zh_CN.jar lib/derbyLocale_zh_TW.jar 
lib/derbynet.jar lib/derbyrun.jar lib/derbytools.jar lib/eclipse-ecj.jar 
lib/jasper-6.0.24.jar lib/jasper-el-6.0.24.jar lib/jdbcpool-0.99.jar 
lib/jetty-6.1.22.jar lib/jetty-util-6.1.22.jar 
lib/jsp-api-2.1-glassfish-9.1.1.B60.25.p2.jar lib/json.jar lib/acf-agents.jar 
lib/acf-core.jar lib/acf-jetty-runner.jar lib/acf-pull-agent.jar 
lib/acf-ui-core.jar lib/log4j-1.2.jar lib/postgresql.jar lib/serializer.jar 
lib/servlet-api-2.5-20081211.jar lib/tomcat-juli-6.0.24.jar lib/xalan2.jar 
lib/xercesImpl-lcf.jar lib/xml-apis.jar"/>

Note that commons-httpclient-acf.jar is our own version of commons-httpclient, 
and must therefore NOT be an external dependency.
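
For anyone who would rather read that list from the built jar than from the ant script, here is a minimal sketch using the standard java.util.jar classes. The ClassPathLister class and its method name are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper; only the java.util.jar calls are standard JDK API.
final class ClassPathLister {
    // Returns the entries of the manifest's Class-Path attribute, in order.
    // Returns an empty list if the attribute is absent.
    static List<String> entries(Manifest manifest) {
        List<String> result = new ArrayList<>();
        String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (value == null) {
            return result;
        }
        // Class-Path entries are separated by whitespace.
        for (String entry : value.trim().split("\\s+")) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

The manifest of a jar on disk can be obtained with new java.util.jar.JarFile("start.jar").getManifest(); each returned entry is then a candidate dependency, apart from project-built jars such as the modified httpclient.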


 Move from ant to maven or other build system with decent library management
 ---

 Key: CONNECTORS-92
 URL: https://issues.apache.org/jira/browse/CONNECTORS-92
 Project: Apache Connectors Framework
  Issue Type: Wish
  Components: Build
Reporter: Jettro Coenradie
Assignee: Karl Wright
 Attachments: maven-poms-including-start-jar.patch, 
 maven-poms-problem-starting-jetty-and-derby.patch, 
 move-to-maven-acf-framework.patch, Screen shot 2010-08-23 at 16.31.07.png


 I am looking at the current project structure. If we want to make another 
 build tool available I think we need to change the directory structure. I 
 tried to place a suggestion in an image. Can you please have a look at it? If 
 we agree that this is a good way to go, then I will continue to work on a 
 patch. Which might be a bit hard with all these changing directories, but 
 I'll do my best to at least get an idea whether it would be working.
 So I have three questions:
 - Do you want to move to maven or put maven next to ant?
 - Do you prefer another build mechanism [ant with ivy, gradle, maven3]
 - Do you have an idea about the amount of scripts that need to be changed if 
 we change the project structure
 The image of a possible project layout (that is based on the maven standards) 
 is attached to the issue


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-27 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903356#action_12903356
]

Karl Wright commented on CONNECTORS-92:
---

I am now ready to commit the connectors reorganization also, once I hear back.

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: move-to-maven-acf-framework.patch, Screen shot
2010-08-23 at 16.31.07.png


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-27 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903465#action_12903465
]

Karl Wright commented on CONNECTORS-92:
---

I should also clarify that, to me, servlet is not just a single class in any
case, but a body of functionality responsible for fielding web requests. So I
think the servlet label is quite accurate. Others, of course, doubtless have
different definitions. ;-)

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: move-to-maven-acf-framework.patch, Screen shot
2010-08-23 at 16.31.07.png


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-27 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903463#action_12903463
]

Karl Wright commented on CONNECTORS-92:
---

bq. Wouldn't it be better to rename the *-servlet into something like war or
web. There will probably be more things in there than a servlet.

No, really, there's just the servlet.
All that I did was break the authority service into a separate web application
and jar file. Both of these were built before under the heading of
authority-service, but since we're getting rigorous, I separated out the
targets. I did the same thing for the api: there's now a servlet and a
service; one yields a jar, the other a war (which includes the jar).

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: move-to-maven-acf-framework.patch, Screen shot
2010-08-23 at 16.31.07.png


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-27 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903477#action_12903477
]

Karl Wright commented on CONNECTORS-92:
---

No, the directories ending in -service produce wars. Those ending in -servlet
produce a jar.

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: move-to-maven-acf-framework.patch, Screen shot
2010-08-23 at 16.31.07.png


[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-08-27 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903556#action_12903556
 ] 

Karl Wright commented on CONNECTORS-98:
---

Jack, if you intend to work on this, can you give me an idea of roughly when I 
can expect to see something?  It looks like there's going to be another 
renaming exercise, and I'd rather not step too hard on ongoing work, so please 
keep us apprised of your schedule/progress.


 API should be pure RESTful with the API verb represented using the HTTP 
 GET/PUT/POST/DELETE methods
 -

 Key: CONNECTORS-98
 URL: https://issues.apache.org/jira/browse/CONNECTORS-98
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: API
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
 Fix For: LCF Release 0.5


 (This was originally a comment on CONNECTORS-56 dated 7/16/2010.)
 It has come to my attention that the API would be more pure RESTful if the 
 API verb was represented using the HTTP GET/PUT/POST/DELETE methods and the 
 input argument identifier represented in the context path.
 So:
 - GET outputconnection/get {connection_name:_connection_name_}
   would be GET outputconnections/connection_name
 - GET outputconnection/delete {connection_name:_connection_name_}
   would be DELETE outputconnections/connection_name
 - GET outputconnection/list
   would be GET outputconnections
 - PUT outputconnection/save {outputconnection:_output_connection_object_}
   would be PUT outputconnections/connection_name
   {outputconnection:_output_connection_object_}
 What we have today is certainly workable, but just not as pure as some 
 might desire. It would be better to take care of this before the initial 
 release so that we never have to answer the question of why it wasn't done as 
 a proper RESTful API.
 BTW, I did check to verify that an HttpServlet running under Jetty can 
 process the DELETE and PUT methods (using the doDelete and doPut method 
 overrides.)
 Also, POST should be usable as an alternative to PUT for API calls that have 
 large volumes of data.
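
 The proposed mapping for output connections can be written down as a small lookup. This is only a sketch of the renaming described in this issue: the OutputConnectionRoutes class and its method are hypothetical, not part of the API servlet, and a real implementation would also have to escape the connection name for use in a URL path.

```java
// Hypothetical translation from the current command-style calls to the
// proposed RESTful method and path; illustrative only.
final class OutputConnectionRoutes {
    // Returns "<HTTP method> <path>" for one of the four current commands.
    static String toRestful(String currentCommand, String connectionName) {
        switch (currentCommand) {
            case "outputconnection/get":
                return "GET outputconnections/" + connectionName;
            case "outputconnection/delete":
                return "DELETE outputconnections/" + connectionName;
            case "outputconnection/list":
                // Listing takes no connection name.
                return "GET outputconnections";
            case "outputconnection/save":
                // The connection object itself would travel in the request body.
                return "PUT outputconnections/" + connectionName;
            default:
                throw new IllegalArgumentException("Unknown command: " + currentCommand);
        }
    }
}
```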

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-26 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902805#action_12902805
]

Karl Wright commented on CONNECTORS-92:
---

It looks to me like you adopted the one-jar-per-maven-script approach, with no
coalescing of jars, but instead introducing /src/main under each of the
subtargets within framework. I'd really like instead to make our job easier by
at least combining the framework main jars together into one target first,
along the lines I described above. I'd also like to get a sense of the overall
picture before proceeding, so can we discuss what individual maven targets
there are that you are proposing, and what each of them is, before we undertake
any changes of this kind? The individual connector ones are obvious, but I'm
concerned about stuff like the integration tests and the quick-start jetty
package. How do you cover those in a maven build?

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: move-to-maven-acf-framework.patch, Screen shot
2010-08-23 at 16.31.07.png


[jira] Commented: (CONNECTORS-97) Web connector session authentication fails for some sites due to cookies httpclient says are illegal, but browsers accept

2010-08-26 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-97?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902880#action_12902880
]

Karl Wright commented on CONNECTORS-97:
---

It turns out that our version of httpclient does not allow this to be
configured. The code in question can be found in the validate() methods in
commons-httpclient-3x/src/java/org/apache/commons/httpclient/cookie/:

CookieSpecBase.java

and

RFC2965Spec.java

Thus, fixing this problem will require adding a configuration parameter to our
httpclient version, as well as changing the web connector to set this
configuration parameter appropriately.
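
To make the rejection concrete, here is a minimal sketch of the kind of path check involved and of how a configuration switch could relax it. It is an illustration only: CookiePathCheck, acceptCookie, and the strictPathCheck flag are hypothetical names, not httpclient's actual code or parameters, and the prefix test is a simplification of what the validate() methods do.

```java
public class CookiePathCheck {
    /**
     * Simplified stand-in for a cookie path validation step.
     * strictPathCheck is a hypothetical configuration flag.
     */
    public static boolean acceptCookie(String cookiePath, String originPath,
                                       boolean strictPathCheck) {
        if (!strictPathCheck) {
            // Lenient mode: skip the path check, as browsers appear to do.
            return true;
        }
        // Strict mode: the cookie path must be a prefix of the request path.
        return originPath.startsWith(cookiePath);
    }

    public static void main(String[] args) {
        // Values taken from the rejected cookies reported in this issue.
        String origin = "/ba/login/login";
        System.out.println("strict:  " + acceptCookie("/forums", origin, true));
        System.out.println("lenient: " + acceptCookie("/forums", origin, false));
    }
}
```

With the strict check, a cookie scoped to /forums cannot be set from /ba/login/login, which matches the "Illegal path attribute" messages in the issue description.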

Web connector session authentication fails for some sites due to cookies
httpclient says are illegal, but browsers accept
-

Key: CONNECTORS-97
URL: https://issues.apache.org/jira/browse/CONNECTORS-97
Project: Apache Connectors Framework
Issue Type: Bug
Components: Web connector
Reporter: Karl Wright

While trying to set up session authentication for the site
http://www.ppdm.org, I ran into authentication problems that resulted from
httpclient rejecting cookies:
Cookie rejected:
ppdm_forum_data=a%3A2%3A%7Bs%3A11%3A%22autologinid%22%3Bs%3A0%3A%22%22%3Bs%3A6%3A%22userid%22%3Bi%3A-1%3B%7D.
Illegal path attribute /forums. Path of origin: /ba/login/login
Cookie rejected: ppdm_forum_sid=338b5f5f0887ab4c2499948fc05daac8. Illegal
path attribute /forums. Path of origin: /ba/login/login
Cookie rejected:
ppdm_forum_data=a%3A2%3A%7Bs%3A11%3A%22autologinid%22%3Bs%3A32%3A%2266a33ac80119bdcf7a1129f78de857a1%22%3Bs%3A6%3A%22userid%22%3Bs%3A4%3A%221346%22%3B%7D.
Illegal path attribute /forums. Path of origin: /ba/login/login
Cookie rejected: ppdm_forum_sid=3c36d20f96423b2de2d215a33b304e18. Illegal
path attribute /forums. Path of origin: /ba/login/login
And yet, FireFox and IE have no trouble with these. I suspect that there
must be a configuration setting for httpclient that will fix this problem -
and if there isn't, we need to add one and set it appropriately in the web
connector code.

[jira] Resolved: (CONNECTORS-97) Web connector session authentication fails for some sites due to cookies httpclient says are illegal, but browsers accept

2010-08-26 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-97?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright resolved CONNECTORS-97.
---

Fix Version/s: LCF Release 0.5
Resolution: Fixed

r989844-r989847

Web connector session authentication fails for some sites due to cookies
httpclient says are illegal, but browsers accept
-

Key: CONNECTORS-97
URL: https://issues.apache.org/jira/browse/CONNECTORS-97
Project: Apache Connectors Framework
Issue Type: Bug
Components: Web connector
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: LCF Release 0.5

[jira] Assigned: (CONNECTORS-97) Web connector session authentication fails for some sites due to cookies httpclient says are illegal, but browsers accept

2010-08-26 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-97?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright reassigned CONNECTORS-97:
-

Assignee: Karl Wright

Web connector session authentication fails for some sites due to cookies
httpclient says are illegal, but browsers accept
-

[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-26 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903151#action_12903151
]

Karl Wright commented on CONNECTORS-92:
---

I rearranged the framework part of the tree to what I believe will satisfy
maven. The rest of the tree I will cover in a subsequent check-in, provided I
got this part right. Can you verify that the current tree is correct, and can
you upload a new maven patch based on the new tree?

Move from ant to maven or other build system with decent library management
---

Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: Apache Connectors Framework
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Attachments: move-to-maven-acf-framework.patch, Screen shot
2010-08-23 at 16.31.07.png

[jira] Commented: (CONNECTORS-93) add contributors to CHANGES.txt

2010-08-25 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902436#action_12902436
 ] 

Karl Wright commented on CONNECTORS-93:
---

So I hear that there has been a lot of recent discussion about our status 
change at gene...@incubator.apache.org, which I was unaware of.  I was not 
subscribed to that list, and had accepted Grant's assessment of our status 
change.  We'll have to see where it leads now.


 add contributors to CHANGES.txt
 ---

 Key: CONNECTORS-93
 URL: https://issues.apache.org/jira/browse/CONNECTORS-93
 Project: Apache Connectors Framework
  Issue Type: Task
  Components: Documentation
Reporter: Robert Muir
 Attachments: CONNECTORS-93.patch


 As mentioned on the connectors-dev@ list (change the format of CHANGES.txt), 
 I propose we modify CHANGES.txt
 to give credit to contributors who have given bug reports, comments, patches, 
 etc.
 I'll volunteer to go thru the mail archives and jira issues that are marked 
 'resolved' and upload a patch here.

[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-24 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901777#action_12901777
]

Karl Wright commented on CONNECTORS-92:
---

bq. As a response to the remark from Karl
(1) Breaking up modules and putting pieces of that all over the place
I do not think they are all over the place, maybe I am thinking wrong about the
modules part, but for me modules is not really clear. At the moment we have
documentation, modules and tests. I suggest a slightly more separated mode
with: documentation, integration-tests, framework, connectors and environment.
The only change is to move some stuff from modules into a new part environment
and move the other parts of modules one level up.

Each thing under modules is something you'd want to build separately, which is
why I chose the arrangement in the first place. If I were deploying these on a
debian system, each would be its own package. That is, each connector would
necessarily be its own package, as would mod-authz-annotate, and
java-environment. Indeed, java-environment was originally a debian package
that was part of the LCF software grant and has not been modified even to
build, because it in effect represented a Debian java deployment framework
rather than actual code. Same thing with postgres-config, except that was for
postgresql configuration under Debian. Furthermore, mod-authz-annotate is C,
and probably cannot be built under maven (or do I have that wrong?)

Therefore, for a maven build we should plan on building the following as
SEPARATE maven deliverables/targets:
- (1) Each connector
- (2) The framework
If there is a way to build C stuff under maven, then this too should be a maven
deliverable/target:
- (3) mod-authz-annotate
These should exist in the tree but be ignored for now, since they are not
applicable to maven at all:
- (4) java-environment
- (5) postgres-config

bq. (2) Taking jetty-runner out of framework
I do not think that Jetty is part of your framework, you create war files and
give the option for an easy start using Jetty. But maybe I am wrong.

I set the jetty example and runner up so that they do not have explicit
dependencies on any individual connectors, and thus they're built as part of
the framework, which they DO have a dependency on. A case could be made for
having these be separated into their own module-level component, in which case
they'd also be their own maven deliverable.

bq. (3) Introducing a src directory under each of the framework components
At the moment, when running ant, you get a lot of folders, and it is not
always easy to tell whether they are original source folders or not. That
is why maven comes with a clear separation of src, generated-source and target
for other generated content. In my opinion this makes it easier to see what is
under version control and what is not.

Check the maven page for more explanation.
http://maven.apache.org/guides/introduction/introduction-to-the-standard-directory-layout.html

I will read the page. It seems to me that we'd need to agree what the maven
deliverables would be before we can decide where the src directory goes. If
the framework is a component all by itself (and I think it should be), then
naturally the structure would be modules/framework/src/... instead. Does
maven allow multiple jars in a deliverable? That would be a necessary
condition.

bq. (4) Moving the tests so far away from the code they are related to
I am not sure if I was clear enough on this. In the original code base a test
folder is available next to modules. For unit tests I would keep them as close
as possible to the source code. Therefore we have the src/main and src/test in
the same module. The integration tests are another beast. Usually a lot of
environmental setup needs to be done, they take longer, and you might want to
store them in a different folder so you can run them all at once. Another
option would be to add them next to the unit tests in a different folder
[src/main, src/test and src/integration-test] or use a different naming
scheme: **Test.java and **IntegrationTest.java. That way you can filter them out
as well and use the maven lifecycle to decide whether to run unit tests or both
unit and integration tests.

As of right now, there are three kinds of tests in the system: Unit tests
(which are checked in in the module they are to test), integration unit tests
(which are checked in at the modules level), and full integration tests that
are a legacy of the LCF code grant (which are checked in in the tests
directory above the modules level). The full integration tests are not
executed but were meant to furnish the rudiments of a test plan, as well as
useful bits for manipulating repositories themselves during test processes, and
thus must be considered reference material at this time. The

[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-24 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901811#action_12901811
]

Karl Wright commented on CONNECTORS-92:
---

bq. Web projects are no problem at all. You can even have dependencies between
web projects, although I would try to make dependencies on jars only.

The question is, who would *want* to depend on any individual ACF war files?
If there's a need, then fine, but I don't see one here. The only use case I
can come up with for anybody depending on ACF is on the main framework jars,
which could be consolidated into one jar quite readily. I would therefore
propose breaking up modules/framework into two pieces:
modules/framework-core, and modules/framework, or some such.
framework-core would contain what's currently in framework/core,
framework/ui-core, framework/agents, and framework/pull-agent, and would have
both an ant build and a maven build that wraps it. framework would contain
crawler-ui, api, authorityservice, and the jetty stuff, and would have a
straight ant build and an ant-with-ivy build wrapping that. Each connector
would have an ant build and a wrapping ant-ivy build also.

Thoughts?

Move from ant to maven or other build system with decent library management
---

[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901309#action_12901309
 ] 

Karl Wright commented on CONNECTORS-91:
---

This patch file worked properly.
Since the automated tests do not exercise the commands, it would be good to set 
up a database instance from scratch using the changed code.  If you have 
already done this, please let me know and I will go ahead and commit the 
changes.


 Making the initialization commands more useable
 ---

 Key: CONNECTORS-91
 URL: https://issues.apache.org/jira/browse/CONNECTORS-91
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jettro Coenradie
 Fix For: LCF Release 0.5

 Attachments: change_commands.patch


 At the moment LCF comes with some classes that can be run from the command 
 line to interact with the system. Examples are DBCreate, DBDrop and 
 LockClean. I wanted to create a class that rebuilds my complete environment: 
 dropping a database, creating a database, cleaning the synch folder, 
 registering agents, etc. Due to the structure of the classes, with all the 
 logic in the main method, I could not easily reuse these classes. In the 
 patch I submit with this issue I have refactored the current solution into a 
 more reusable solution that can still be called from the command line.

[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901312#action_12901312
]

Karl Wright commented on CONNECTORS-91:
---

Another thing I had not noticed before is that this patch removes all stderr
success confirmation messages for those folks who use the commands, and
replaces them with log output. The log output is perfectly fine, but removing
the feedback that the command was successful is, I think, not great. If the
log were going to stderr typically that would be OK, but it typically is not,
so I think you are going to want to do both. You would, obviously, want to do
the stderr output within the main() method.

Would it be possible to fix that up before I commit this?
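
A rough sketch of doing both: keep the reusable logic in an instance method that only logs, and print the confirmation from main(). The class name InitCommandSketch and its placeholder body are hypothetical, and java.util.logging merely stands in for whatever logger the patch uses.

```java
import java.util.logging.Logger;

public class InitCommandSketch {
    // Stand-in logger. Note that java.util.logging's default console handler
    // writes to stderr, which a real log configuration typically would not.
    private static final Logger LOG =
        Logger.getLogger(InitCommandSketch.class.getName());

    private boolean done = false;

    /** Reusable entry point: logs, but never prints to stderr or exits. */
    public void execute() {
        // Placeholder for the real work (for example, creating a database).
        done = true;
        LOG.info("Command completed");
    }

    public boolean isDone() {
        return done;
    }

    /** Command-line entry point: adds the stderr feedback for users. */
    public static void main(String[] args) {
        if (args.length != 0) {
            System.err.println("Usage: InitCommandSketch");
            System.exit(1);
        }
        new InitCommandSketch().execute();
        System.err.println("Successfully ran InitCommandSketch");
    }
}
```

Other classes can then call execute() directly without triggering any console output or process exit.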

Making the initialization commands more useable
---

Key: CONNECTORS-91
URL: https://issues.apache.org/jira/browse/CONNECTORS-91
Project: Apache Connectors Framework
Issue Type: Improvement
Components: Framework core
Reporter: Jettro Coenradie
Fix For: LCF Release 0.5

Attachments: change_commands.patch

[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901336#action_12901336
]

Karl Wright commented on CONNECTORS-91:
---

I looked at this. The patch seems correct for some classes, but for others it
is clearly incorrect, e.g. SynchronizeAll:

{
System.err.println("Usage: SynchronizeAll");
System.exit(1);
+ System.err.println("Successfully synchronized all agents");
}

Can you review your change for accuracy please?
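
For comparison, a sketch of the presumably intended ordering: the usage error exits early, and the success message is printed only after the work has run. This is an illustration, not the committed code. SynchronizeAllSketch, run(), and synchronize() are hypothetical names, the argument check is an assumption, and run() returns an exit code instead of calling System.exit so that the flow is easy to test.

```java
import java.io.PrintStream;

public class SynchronizeAllSketch {
    /** Placeholder for the real synchronization work. */
    static void synchronize() {
        // Intentionally empty in this sketch.
    }

    /** Returns the process exit code: 1 for bad usage, 0 for success. */
    public static int run(String[] args, PrintStream err) {
        if (args.length > 0) {
            err.println("Usage: SynchronizeAll");
            return 1; // nothing after this point runs for bad usage
        }
        synchronize();
        // Reached only when the work above completed without an exception.
        err.println("Successfully synchronized all agents");
        return 0;
    }

    public static void main(String[] args) {
        System.exit(run(args, System.err));
    }
}
```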

Also, responding to the logging change - the log settings are global, and we
are trying for the least amount of setup work necessary to achieve a functional
system. Clearly, sending all log messages to stderr is not going to be reasonable for
people doing real crawls, so we'd need some way to segregate command output in
order to direct it differently than everything else, which implies at the least
you'd want a different logger, and then you'd also want to revise the
documented log4j properties, if you think we should go that route.
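
As a rough illustration of the separate-logger idea, the sketch below gives command output its own logger with a dedicated handler, so it can be routed differently from crawl logging. It uses java.util.logging only to stay self-contained; the project documents log4j properties, and the logger name "commands" and the class name are assumptions.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class CommandLoggerSketch {
    /** Builds a logger whose output goes only to stderr, independent of the root logger. */
    public static Logger commandLogger() {
        Logger logger = Logger.getLogger("commands"); // hypothetical logger name
        logger.setUseParentHandlers(false);           // do not inherit the global handlers
        if (logger.getHandlers().length == 0) {
            StreamHandler handler = new StreamHandler(System.err, new SimpleFormatter());
            handler.setLevel(Level.INFO);
            logger.addHandler(handler);
        }
        logger.setLevel(Level.INFO);
        return logger;
    }

    public static void main(String[] args) {
        Logger log = commandLogger();
        log.info("Successfully synchronized all agents");
        for (Handler h : log.getHandlers()) {
            h.flush(); // StreamHandler buffers, so flush to make the message appear
        }
    }
}
```

The equivalent in log4j would be a dedicated logger category with its own appender, which is where the documented properties would need revising.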

Re: testing. The testing you've done so far is the best we can do at the moment,
unless you'd also like to write some unit tests. I don't think this would be
terribly difficult, but once again it would be time consuming. ;-)

Making the initialization commands more useable
---

Attachments: change_commands.patch,
change_commands_with_system_err_println.patch

[jira] Assigned: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-91:
-

Assignee: Karl Wright

 Making the initialization commands more useable
 ---

 Key: CONNECTORS-91
 URL: https://issues.apache.org/jira/browse/CONNECTORS-91
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jettro Coenradie
Assignee: Karl Wright
 Fix For: LCF Release 0.5

 Attachments: change_commands.patch, 
 change_commands_with_system_err_println.patch, 
 change_commands_with_system_err_println_v2.patch


[jira] Resolved: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-91.
---

Resolution: Fixed

Patch committed.
r988101.


 Making the initialization commands more useable
 ---

 Key: CONNECTORS-91
 URL: https://issues.apache.org/jira/browse/CONNECTORS-91
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jettro Coenradie
Assignee: Karl Wright
 Fix For: LCF Release 0.5

 Attachments: change_commands.patch, 
 change_commands_with_system_err_println.patch, 
 change_commands_with_system_err_println_v2.patch


[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-23 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901432#action_12901432
 ] 

Karl Wright commented on CONNECTORS-92:
---

This proposed change has a number of features I don't understand the reasons 
for:

(1) Breaking up modules and putting pieces of that all over the place
(2) Taking jetty-runner out of framework
(3) Introducing a src directory under each of the framework components
(4) Moving the tests so far away from the code they are related to

Can you describe your logic for this reorganization?


 Move from ant to maven or other build system with decent library management
 ---

 Key: CONNECTORS-92
 URL: https://issues.apache.org/jira/browse/CONNECTORS-92
 Project: Apache Connectors Framework
  Issue Type: Wish
  Components: Build
Reporter: Jettro Coenradie
 Attachments: Screen shot 2010-08-23 at 16.31.07.png


 I am looking at the current project structure. If we want to make another 
 build tool available I think we need to change the directory structure. I 
 tried to place a suggestion in an image. Can you please have a look at it? If 
 we agree that this is a good way to go, then I will continue to work on a 
 patch. That might be a bit hard with all these changing directories, but 
 I'll do my best to at least get an idea of whether it would work.
 So I have three questions:
 - Do you want to move to maven or put maven next to ant?
 - Do you prefer another build mechanism [ant with ivy, gradle, maven3]?
 - Do you have an idea about the number of scripts that need to be changed if 
 we change the project structure?
 The image of a possible project layout (that is based on the maven standards) 
 is attached to the issue.

[jira] Commented: (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2010-08-23 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901436#action_12901436
]

Karl Wright commented on CONNECTORS-92:
---

Re: build preferences

Continuing to have an ant build is actually pretty important for some modes of
delivery. I'm specifically thinking of debian and Ubuntu packaging here.
Maven does not work well with these packaging schemes because it's too
all-encompassing. We therefore need a way of doing builds locally, without
pulling things down from a mirror.

My original thought was that we'd have multiple layers - ant being the most
basic, with a maven wrapper available to pull down what the ant build needed,
and have the maven build call ant underneath. I don't know how realistic that
is, but it does solve all the problems if it can be done that way.

Move from ant to maven or other build system with decent library management
---

[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898920#action_12898920
 ] 

Karl Wright commented on CONNECTORS-91:
---

It looks like this is simply using class-inheritance to separate out common 
functionality.  As such, I'm in favor of including this contribution.  Are 
there any subtleties I am missing?
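
The shape being described is roughly the following: a base class owns the shared steps, and each command supplies only its own work. This is a sketch under assumptions; BaseCommandSketch, doExecute(), and the two subclasses are hypothetical names rather than the classes in the patch, and the "setup" and "cleanup" steps are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

public class CommandInheritanceSketch {
    /** Shared behavior lives here; subclasses implement only doExecute(). */
    public abstract static class BaseCommandSketch {
        protected final List<String> trace;

        protected BaseCommandSketch(List<String> trace) {
            this.trace = trace;
        }

        public final void execute() {
            trace.add("setup");   // common initialization for every command
            doExecute();
            trace.add("cleanup"); // common teardown for every command
        }

        protected abstract void doExecute();
    }

    public static class DropSketch extends BaseCommandSketch {
        public DropSketch(List<String> trace) { super(trace); }
        @Override protected void doExecute() { trace.add("drop database"); }
    }

    public static class CreateSketch extends BaseCommandSketch {
        public CreateSketch(List<String> trace) { super(trace); }
        @Override protected void doExecute() { trace.add("create database"); }
    }

    /** Because the logic is no longer in main(), commands can be chained. */
    public static List<String> rebuild() {
        List<String> trace = new ArrayList<>();
        new DropSketch(trace).execute();
        new CreateSketch(trace).execute();
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(rebuild());
    }
}
```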


 Making the initialization commands more useable
 ---

 Key: CONNECTORS-91
 URL: https://issues.apache.org/jira/browse/CONNECTORS-91
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jettro Coenradie
 Fix For: LCF Release 0.5

 Attachments: commandsPatch.patch


[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-23 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891533#action_12891533
 ] 

Karl Wright commented on CONNECTORS-55:
---

MVCC is the feature that suggests greater concurrency (and, hence, greater 
performance).


 Bundle database server with LCF packaged product
 

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Sub-task
  Components: Installers
Reporter: Jack Krupansky

 The current requirement that the user install and deploy a PostgreSQL server 
 complicates the installation and deployment of LCF for the user. Installation 
 and deployment of LCF should be as simple as Solr itself. QuickStart is great 
 for the low-end and basic evaluation, but a comparable level of simplified 
 installation and deployment is still needed for full-blown, high-end 
 environments that need the full performance of a PostgreSQL-class database 
 server. So, PostgreSQL should be bundled with the packaged release of LCF so 
 that installation and deployment of LCF will automatically install and deploy 
 a subset of the full PostgreSQL distribution that is sufficient for the needs 
 of LCF. Starting LCF, with or without the LCF UI, should automatically start 
 the database server. Shutting down LCF should also shut down the database 
 server process.
 A typical use case would be for a non-developer who is comfortable with Solr 
 and simply wants to crawl documents from, for example, a SharePoint 
 repository and feed them into Solr. QuickStart should work well for the low 
 end or in the early stages of evaluation, but the user would prefer to 
 evaluate the real thing with something resembling a production crawl of 
 thousands of documents. Such a user might not be a hard-core developer or be 
 comfortable fiddling with a lot of software components simply to do one 
 conceptually simple operation.
 It should still be possible for the user to supply database server settings 
 to override the defaults, but the LCF package should have all of the 
 best-practice settings deemed appropriate for use with LCF.
 One downside is that installation and deployment will be platform-specific 
 since there are multiple processes and PostgreSQL itself requires a 
 platform-specific installation.
 This proposal presumes that PostgreSQL is the best option for the foreseeable 
 future, but nothing here is intended to preclude support for other database 
 servers in future releases.
 This proposal should not have any impact on QuickStart packaging or 
 deployment.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

[jira] Created: (CONNECTORS-76) Document Web Connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)

Document Web Connector configuration/specification API pieces
-

 Key: CONNECTORS-76
 URL: https://issues.apache.org/jira/browse/CONNECTORS-76
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document web connector-specific API objects and commands.


[jira] Updated: (CONNECTORS-58) Mini-API to initially configure default connections and example jobs for file system and web crawl

2010-07-16 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright updated CONNECTORS-58:
--

Priority: Minor (was: Major)
Component/s: Examples
(was: Framework core)

I'm going to put this in a new category called examples.

Mini-API to initially configure default connections and example jobs for
file system and web crawl
-

Key: CONNECTORS-58
URL: https://issues.apache.org/jira/browse/CONNECTORS-58
Project: Lucene Connector Framework
Issue Type: Sub-task
Components: Examples
Reporter: Jack Krupansky
Priority: Minor

Creating a basic connection setup to do a relatively simple crawl for a file
system or web can be a daunting task for someone new to LCF. So, it would be
nice to have a scripting file that supports an abbreviated API (subset of the
full API discussed in CONNECTORS-56) sufficient to create a default set of
connections and example jobs that the new user can choose from.
Beyond this initial need, this script format might be a useful form to dump
all of the connections and jobs in the LCF database in a form that can be
used to recreate an LCF configuration. Kind of a dump and reload
capability. That in fact might be how the initial example script gets created.
Those are two distinct use cases, but could utilize the same feature.
The example script could have example jobs to crawl a subdirectory of LCF,
crawl the LCF wiki, etc.
There could be more than one script. There might be example scripts for each
form of connector.
This capability should be available for both QuickStart and the general
release of LCF.
As just one possibility, the script format might be a sequence of JSON
expressions, each with an initial string analogous to a servlet path to
specify the operation to be performed, followed by the JSON form of the
connection or job or other LCF object. Or, some other format might be more
suitable.
Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
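
To illustrate the JSON possibility mentioned above, here is a minimal sketch that splits one script line into its operation path and JSON body. The line syntax, the path "connections/filesystem", and the field name are all made up for illustration; nothing here reflects an agreed LCF script format or API.

```java
public class ScriptLineSketch {
    /** Holds the two parts of one hypothetical script line. */
    public static final class Entry {
        public final String path;
        public final String json;

        Entry(String path, String json) {
            this.path = path;
            this.json = json;
        }
    }

    /** Splits "<path> <json>" at the first space; the JSON is kept as raw text. */
    public static Entry parse(String line) {
        String trimmed = line.trim();
        int space = trimmed.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException("Expected: <path> <json>");
        }
        return new Entry(trimmed.substring(0, space),
                         trimmed.substring(space + 1).trim());
    }

    public static void main(String[] args) {
        // Hypothetical script line: an operation path followed by a JSON object.
        Entry e = parse("connections/filesystem {\"name\":\"Example files\"}");
        System.out.println(e.path + " -> " + e.json);
    }
}
```

A real implementation would hand the JSON text to a proper parser and dispatch on the path; this sketch only shows the line structure.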

[jira] Updated: (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API

2010-07-16 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright updated CONNECTORS-50:
--

Component/s: (was: Framework core)

Moving this out of core, since it's a planning ticket not a software issue.

Proposal for initial two releases of LCF, including packaged product and full
API
-

Key: CONNECTORS-50
URL: https://issues.apache.org/jira/browse/CONNECTORS-50
Project: Lucene Connector Framework
Issue Type: New Feature
Reporter: Jack Krupansky

Currently, LCF has a relatively high bar for evaluation and use, requiring
developer expertise. Also, although LCF has a comprehensive UI, it is not
currently packaged for use as a crawling engine for advanced applications.
A small set of individual feature requests are needed to address these
issues. They are summarized briefly to show how they fit together for two
initial releases of LCF, but will be broken out into individual LCF Jira
issues.
Goals:
1. LCF as a standalone, downloadable, usable-out-of-the-box product (much as
Solr is today)
2. LCF as a toolkit for developers needing customized crawling and repository
access
3. An API-based crawling engine that can be integrated with applications (as
Aperture is today)
Larger goals:
1. Make it very easy for users to evaluate LCF.
2. Make it very easy for developers to customize LCF.
3. Make it very easy for applications to fully manage and control LCF in
operation.
Two phases:
1) Standalone, packaged app that is super-easy to evaluate and deploy. Call
it LCF 0.5.
2) API-based crawling engine for applications for which the UI might not be
appropriate. Call it LCF 1.0.
Phase 1
---
LCF 0.5 right out of the box would interface loosely with Solr 1.4 or later.
It would contain roughly the features that are currently in place or
currently underway, plus a little more.
Specifically, LCF 0.5 would contain these additional capabilities:
1. Plug-in architecture for connectors (CONNECTORS-40 - DONE)
2. Packaged app ready to run with embedded Jetty app server (CONNECTORS-59)
3. Bundled with database - PostgreSQL or Derby - ready to run without
additional manual setup (CONNECTORS-55)
4. Mini-API to initially configure default connections and example jobs for
file system and web crawl (CONNECTORS-58)
5. Agent process started automatically (CONNECTORS-60)
6. Solr output connector option to commit at end of job, by default
(CONNECTORS-57)
Installation and basic evaluation of LCF would be essentially as simple as
Solr is today. The example connections and jobs would permit the user to
initiate example crawls of a file system example directory and an example web
on the LCF web site with just a couple of clicks (as opposed to the detailed
manual setup required today to create repository and output connections and
jobs).
It is worth considering whether the SharePoint connector could also be
included as part of the default package.
Users could then add additional connectors and repositories and jobs as
desired.
Timeframe for release? Level of effort?
Phase 2
---
The essence of Phase 2 is that LCF would be split to allow direct, full API
access to LCF as a crawling engine, in addition to the full LCF UI. Call this
LCF 1.0.
Specifically, LCF 1.0 would contain these additional capabilities:
1. Full API for LCF as a crawling engine (CONNECTORS-56)
2. LCF can be bundled within an app (CONNECTORS-61)
3. LCF event and activity notification for full control by an application
(CONNECTORS-41)
Overall, LCF will offer roughly the same crawling capabilities as LCF 0.5,
plus whatever bug fixes and minor enhancements might also be added.
Timeframe for release? Level of effort?
-
Issues:
- Can we package PostgreSQL with LCF so LCF can set it up?
- Or do we need Derby for that purpose?
- Managing multiple processes (UI, database, agent, app processes)
- What exactly would the API look like? (URL, XML, JSON, YAML?)


[jira] Updated: (CONNECTORS-60) Agent process should be started automatically

2010-07-14 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright updated CONNECTORS-60:
--

Priority: Minor (was: Major)
Description:
LCF as it exists today is a bit too complex to run for an average user,
especially with a separate agent process for crawling. LCF should be as easy to
run as Solr is today. QuickStart is a good move in this direction, but the same
user-visible simplicity is needed for full LCF. The separate agent process is a
reasonable design for execution, but a little too cumbersome for the average
user to manage.

Unfortunately, it is expected that starting up a multi-process application will
require platform-specific scripting.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

KDW - this functionality is already present; however the documentation is not
adequate to help people figure out how to do it. So I'm moving this to
Documentation and treating it as a doc bug.

was:
LCF as it exists today is a bit too complex to run for an average user,
especially with a separate agent process for crawling. LCF should be as easy to
run as Solr is today. QuickStart is a good move in this direction, but the same
user-visible simplicity is needed for full LCF. The separate agent process is a
reasonable design for execution, but a little too cumbersome for the average
user to manage.

Unfortunately, it is expected that starting up a multi-process application will
require platform-specific scripting.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

Component/s: Documentation
(was: Framework agents process)

Agent process should be started automatically
-

Key: CONNECTORS-60
URL: https://issues.apache.org/jira/browse/CONNECTORS-60
Project: Lucene Connector Framework
Issue Type: Sub-task
Components: Documentation
Reporter: Jack Krupansky
Priority: Minor



[jira] Resolved: (CONNECTORS-59) Packaged app ready to run with embedded Jetty app server

2010-07-14 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-59.
---

Resolution: Fixed

I am unaware of any lingering issues with the QuickStart work.


 Packaged app ready to run with embedded Jetty app server 
 -

 Key: CONNECTORS-59
 URL: https://issues.apache.org/jira/browse/CONNECTORS-59
 Project: Lucene Connector Framework
  Issue Type: Sub-task
  Components: Framework core
Reporter: Jack Krupansky

 Many potential users of LCF are not necessarily sophisticated developers who 
 are prepared to work with code, but are able to install packaged software, 
 much as Solr is currently distributed. QuickStart for LCF is a good move in 
 this direction, but similar packaging is needed for full LCF with a 
 production database server. This issue focuses on assuring that full LCF is 
 released as a packaged app suitable for download and immediate use without 
 any additional software development expertise required.
 Database packaging has already been called out as a distinct issue 
 (CONNECTORS-55), so this issue is more of a catch-all for any lingering work 
 needed to address support for full LCF as a packaged app.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.


[jira] Commented: (CONNECTORS-61) Support bundling of LCF with an app

2010-07-13 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887806#action_12887806
 ] 

Karl Wright commented on CONNECTORS-61:
---

I'm tempted to close this issue because (a) there is absolutely no reason 
anyone competent cannot bundle LCF with an app today, and (b) it is completely 
unclear what, if anything, the 'fix' would look like.  A specific statement of 
an actual concrete problem is the only thing that will prevent me from closing 
this.


--- original message ---
From: ext Jack Krupansky (JIRA) j...@apache.org
Subject: [jira] Created: (CONNECTORS-61) Support bundling of LCF with an app
Date: July 12, 2010
Time: 2:48:11  PM


Support bundling of LCF with an app
---

 Key: CONNECTORS-61
 URL: https://issues.apache.org/jira/browse/CONNECTORS-61
 Project: Lucene Connector Framework
  Issue Type: Sub-task
  Components: Framework core
Reporter: Jack Krupansky


It should be possible for an application developer to bundle LCF with an 
application to facilitate installation and deployment of the application in 
conjunction with LCF. This may (or may not) be as simple as providing 
appropriate jar files and documentation for how to use them, but there may be 
other components or scripts needed.

There are two options: 1) include the LCF UI along with the other LCF 
processes, and 2) exclude the LCF UI and include only the other processes that 
can be controlled via the full API.

The database server would be included.

The web app server would be optional since the application may have its own 
choice of web app server.

One use case is bundling LCF with Solr or a Solr-based application.

Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.








[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886717#action_12886717
 ] 

Karl Wright commented on CONNECTORS-55:
---

Mark,

If your concern is about installing LCF, read the Quick Start part of the 
build/deploy page.  You check out, build, and run.  Derby-based.  Nothing else 
to install.   Not hard really.



 Bundle database server with LCF packaged product
 

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky

 The current requirement that the user install and deploy a PostgreSQL server 
 complicates the installation and deployment of LCF for the user. Installation 
 and deployment of LCF should be as simple as Solr itself. QuickStart is great 
 for the low-end and basic evaluation, but a comparable level of simplified 
 installation and deployment is still needed for full-blown, high-end 
 environments that need the full performance of a PostgreSQL-class database 
 server. So, PostgreSQL should be bundled with the packaged release of LCF so 
 that installation and deployment of LCF will automatically install and deploy 
 a subset of the full PostgreSQL distribution that is sufficient for the needs 
 of LCF. Starting LCF, with or without the LCF UI, should automatically start 
 the database server. Shutting down LCF should also shut down the database 
 server process.
 A typical use case would be for a non-developer who is comfortable with Solr 
 and simply wants to crawl documents from, for example, a SharePoint 
 repository and feed them into Solr. QuickStart should work well for the low 
 end or in the early stages of evaluation, but the user would prefer to 
 evaluate the real thing with something resembling a production crawl of 
 thousands of documents. Such a user might not be a hard-core developer or be 
 comfortable fiddling with a lot of software components simply to do one 
 conceptually simple operation.
 It should still be possible for the user to supply database server settings 
 to override the defaults, but the LCF package should have all of the 
 best-practice settings deemed appropriate for use with LCF.
 One downside is that installation and deployment will be platform-specific 
 since there are multiple processes and PostgreSQL itself requires a 
 platform-specific installation.
 This proposal presumes that PostgreSQL is the best option for the foreseeable 
 future, but nothing here is intended to preclude support for other database 
 servers in future releases.
 This proposal should not have any impact on QuickStart packaging or 
 deployment.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.


[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886722#action_12886722
 ] 

Karl Wright commented on CONNECTORS-55:
---


> forcing the user to pick the right/acceptable release of PostgreSQL to install 
> is error prone and a support headache


Yup.  It is.  Problem is that products/versions get security fixes, CVEs, 
end-of-life notices, etc.  It is beyond the scope of LCF to try and control all 
that - we'd be buying a whole new level of support headache, believe me.


 Bundle database server with LCF packaged product
 

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886730#action_12886730
 ] 

Karl Wright commented on CONNECTORS-55:
---

The quick-start even takes care of connector registration for you, so 
executecommand is not needed even then.  What you *don't* get to do is use the 
command-based API to control LCF; that's not going to work in the 
single-process model.

By the way, hsqldb is apparently limited to a 16GB database (version 2.0).  
That's not very much.


 Bundle database server with LCF packaged product
 

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky



[jira] Commented: (CONNECTORS-38) There should be an LCF startup path that uses Jetty for running lcf-crawler-ui and lcf-authority-service

2010-07-06 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885746#action_12885746
 ] 

Karl Wright commented on CONNECTORS-38:
---

Code complete.  There's now a dist/example directory, and you run LCF with the 
command "java -jar start.jar" from that directory, just like Solr.

Documentation needs updating, but otherwise this ticket is complete.


 There should be an LCF startup path that uses Jetty for running 
 lcf-crawler-ui and lcf-authority-service
 

 Key: CONNECTORS-38
 URL: https://issues.apache.org/jira/browse/CONNECTORS-38
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright

 Integrating with Jetty would allow LCF to be deployed in simple cases without 
 requiring Tomcat, which would simplify the setup in such cases.  This of 
 course should not be construed as removing the support for Tomcat-style web 
 applications.


[jira] Commented: (CONNECTORS-38) There should be an LCF startup path that uses Jetty for running lcf-crawler-ui and lcf-authority-service

2010-07-02 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884651#action_12884651
]

Karl Wright commented on CONNECTORS-38:
---

I've started to look at what would be necessary to perform this work. If the
quick-start implementation will be using embedded Derby, then it must run in
a single process (or Derby is not happy at all). That would include the
crawler UI, the authority service, and the crawler daemon.

If Jetty can be configured to run in such a way as to use system classes for
all of its web applications, then in theory it should be possible to put
together an LCF which, on startup, spawns the crawler daemon before starting up
Jetty within the same process. For the classloader issue, there seems to be a
considerable degree of configuration flexibility, as described here:

http://docs.codehaus.org/display/JETTY/Classloading

The rest of the problem, i.e. starting and stopping Jetty programmatically, may
be doable based on this page:

http://docs.codehaus.org/display/JETTY/Embedding+Jetty

However, (1) it's really not clear what model I should be using. I basically
need to be able to fire up two entire web applications, which don't need to be
in wars necessarily, but which certainly need to contain JSPs, .css files,
.jpg files, TLDs, and other standard webish content. And (2), it's not clear
if/how you properly perform Jetty shutdown using the chosen model. Any advice
welcome.
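
A minimal sketch of the single-process ordering described above, in Java. It
is not Jetty code: the JDK's built-in com.sun.net.httpserver.HttpServer stands
in for Jetty so that the example has no external dependencies, and the crawler
daemon is reduced to a placeholder thread. The class and method names are
hypothetical. It only shows the sequence: start the crawler thread, then the
web server in the same JVM, and stop them in reverse order.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;

final class SingleProcessStartupSketch {
    private final CountDownLatch stopSignal = new CountDownLatch(1);
    private Thread crawler;
    private HttpServer web;

    /** Starts the stand-in crawler thread first, then the web server, in this JVM. */
    void start(int port) throws IOException {
        crawler = new Thread(() -> {
            try {
                stopSignal.await(); // placeholder for the real crawl loop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "crawler-daemon");
        crawler.setDaemon(true);
        crawler.start();

        // Port 0 asks the operating system for any free port.
        web = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        web.createContext("/", exchange -> {
            byte[] body = "crawler ui placeholder".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        web.start();
    }

    /** Stops in reverse order: web server first, then the crawler thread. */
    void stop() throws InterruptedException {
        web.stop(0);
        stopSignal.countDown();
        crawler.join(5000);
    }

    boolean crawlerAlive() {
        return crawler != null && crawler.isAlive();
    }

    int port() {
        return web.getAddress().getPort();
    }

    public static void main(String[] args) throws Exception {
        SingleProcessStartupSketch app = new SingleProcessStartupSketch();
        app.start(0);
        System.out.println("listening on port " + app.port());
        app.stop();
    }
}
```

With Jetty in place of the stand-in server, the open questions above remain:
which embedding model to use for two full web applications, and how to shut
Jetty down cleanly.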

There should be an LCF startup path that uses Jetty for running
lcf-crawler-ui and lcf-authority-service

Key: CONNECTORS-38
URL: https://issues.apache.org/jira/browse/CONNECTORS-38
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Karl Wright

Integrating with Jetty would allow LCF to be deployed in simple cases without
requiring Tomcat, which would simplify the setup in such cases. This of
course should not be construed as removing the support for Tomcat-style web
applications.


[jira] Commented: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-07-01 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884285#action_12884285
]

Karl Wright commented on CONNECTORS-40:
---

Classloader has been added, and the configuration file format is now XML. The
wiki connector description pages have been updated. Next:

- Change the build process and connector delivery model to take advantage of
the classloader
- Change the build process wiki document to reflect all changes

Classloader-based plug-in architecture would permit LCF to be prebuilt
--

Key: CONNECTORS-40
URL: https://issues.apache.org/jira/browse/CONNECTORS-40
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Karl Wright
Assignee: Karl Wright

The LCF architecture at this point requires interaction with the build script
in order to add connectors. This is because the connector JSPs and jars need
to be added to the appropriate war files. However, there is another
architectural option that would eliminate this need, which is to use a custom
classloader to pull components from jars that are placed in a specific
directory or directories.
In order for this to work, however, the UI components of every connector must
become part of a jar. That implies that they will need to cease being JSPs,
and become instead methods of each connector class. (There is no
proscription against using something like Velocity for assembling the
necessary output for a connector, however.) Limiting the
backwards-compatibility impact of this change will be difficult, especially
after a first release is made, so it seems clear that any change along these
lines should be attempted before version 1.0 is released.
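
The custom-classloader option described above can be sketched in Java roughly
as follows, using the standard java.net.URLClassLoader. This is an illustration
under assumptions, not the implementation that was committed: the directory
name "connector-lib" and the class name PluginLoaderSketch are hypothetical.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PluginLoaderSketch {
    /** Collects every *.jar directly under the directory as a URL, sorted by path. */
    static URL[] jarUrls(File dir) throws MalformedURLException {
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        List<URL> urls = new ArrayList<>();
        if (jars != null) { // null means the directory is missing or unreadable
            Arrays.sort(jars);
            for (File jar : jars) {
                urls.add(jar.toURI().toURL());
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Builds a loader that sees the plug-in jars plus everything the framework sees. */
    static URLClassLoader loaderFor(File dir) throws MalformedURLException {
        return new URLClassLoader(jarUrls(dir), PluginLoaderSketch.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        // "connector-lib" is a hypothetical default directory name.
        File dir = new File(args.length > 0 ? args[0] : "connector-lib");
        try (URLClassLoader loader = loaderFor(dir)) {
            System.out.println(loader.getURLs().length + " jar(s) on the plug-in classpath");
            if (args.length > 1) {
                // Optionally load a connector class by name from the plug-in jars.
                Class<?> connectorClass = Class.forName(args[1], true, loader);
                System.out.println("Loaded " + connectorClass.getName());
            }
        }
    }
}
```

Because classes are resolved from jars at run time, anything a connector needs
for its UI has to live in the jar as well, which is why the JSPs would have to
become methods of each connector class.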


[jira] Resolved: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-07-01 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright resolved CONNECTORS-40.
---

Resolution: Fixed

All code committed. Related tickets (such as removing the need for
connector-specific -D switches) still in progress.

Classloader-based plug-in architecture would permit LCF to be prebuilt
--


[jira] Resolved: (CONNECTORS-47) Framework UI seems to call connector post processing more than needed

2010-06-30 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-47.
---

  Assignee: Karl Wright
Resolution: Fixed

r959393.  Refactor as needed to solidify the contract between edit pages and 
the execute.jsp post page.


 Framework UI seems to call connector post processing more than needed
 -

 Key: CONNECTORS-47
 URL: https://issues.apache.org/jira/browse/CONNECTORS-47
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Framework crawler agent
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor

 Connector form post processing is currently invoked both in execute.jsp 
 (which is the target of all form posts), as well as in individual edit pages 
 (such as editconfig.jsp and editjob.jsp).  Unless a reason can be found for 
 why this is done, the individual edit page calls should be removed, since 
 they are by definition superfluous.
 Possible reasons it was done this way were:
 (a) that code predates execute.jsp
 (b) some other functionality, e.g. copy or posting of certificates, needs it
 At any rate, this should be looked at after the bulk of CONNECTORS-40 related 
 changes are committed to trunk.


[jira] Created: (CONNECTORS-51) Reduce the number of required -D defines by using System.setProperty() in the appropriate places

2010-06-30 Thread Karl Wright (JIRA)

Reduce the number of required -D defines by using System.setProperty() in the 
appropriate places


 Key: CONNECTORS-51
 URL: https://issues.apache.org/jira/browse/CONNECTORS-51
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: JCIFS connector
Reporter: Karl Wright
Priority: Minor


The JCIFS connector requires a fair number of -D switches in the java startup 
in order to do the right things.  This is largely because jcifs.jar is 
constructed this way.  It may be possible, however, to eliminate these -D's by 
judicious static use of System.setProperty() within the appropriate connector 
class, provided we presume that jcifs classes will never be loaded prior to the 
jcifs connector classes being loaded.
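
A minimal Java sketch of the static System.setProperty() approach suggested
above. The class name JcifsDefaultsSketch is hypothetical, and the property
values are illustrative only; the property names should be checked against the
jcifs documentation for the version in use. A default is set only when the
property is absent, so an explicit -D on the command line still wins.

```java
final class JcifsDefaultsSketch {
    static {
        // Runs once, when this class is first initialized. It only helps if that
        // happens before any jcifs class reads its configuration.
        // Values below are illustrative, not recommendations.
        setIfAbsent("jcifs.smb.client.soTimeout", "150000");
        setIfAbsent("jcifs.smb.client.responseTimeout", "120000");
        setIfAbsent("jcifs.resolveOrder", "LMHOSTS,DNS,WINS");
    }

    /** Sets a system property unless the user already supplied one, e.g. via -D. */
    static void setIfAbsent(String key, String value) {
        if (System.getProperty(key) == null) {
            System.setProperty(key, value);
        }
    }

    /** Calling this forces class initialization, and with it the static block. */
    static void ensureLoaded() {
    }

    public static void main(String[] args) {
        ensureLoaded();
        System.out.println("jcifs.resolveOrder=" + System.getProperty("jcifs.resolveOrder"));
    }
}
```

In a connector, the static block would sit in the connector class itself, so
that loading the connector applies the defaults before jcifs is touched.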



[jira] Assigned: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-06-30 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright reassigned CONNECTORS-40:
-

Assignee: Karl Wright

Classloader-based plug-in architecture would permit LCF to be prebuilt
--


[jira] Commented: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-06-29 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883595#action_12883595
 ] 

Karl Wright commented on CONNECTORS-40:
---

The UI changes have been made, largely hand-tested, and merged into trunk.  
Next steps for this ticket include:

- Updating the wiki page on how to build a connector
- Writing the classloader implementation that will actually allow for plugin 
loading


 Classloader-based plug-in architecture would permit LCF to be prebuilt
 --

 Key: CONNECTORS-40
 URL: https://issues.apache.org/jira/browse/CONNECTORS-40
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright



[jira] Resolved: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-29 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-49.
---

Resolution: Fixed

r959167.  Tested, except in the context of an actual crawl.


 Solr connector metadata and id field can collide, causing multiple id fields 
 to be passed in
 

 Key: CONNECTORS-49
 URL: https://issues.apache.org/jira/browse/CONNECTORS-49
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Lucene/SOLR connector
Reporter: Karl Wright
Assignee: Karl Wright

 If a document has a metadata field called id, or ID, or Id, or any such 
 thing, the Solr connector will blithely send both the document id and the 
 metadata id along to Solr, which will then crap out with an error.  The 
 solution is to map the metadata id field to something else, which should be 
 determined by the solr connection definition.


[jira] Commented: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-23 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881604#action_12881604
 ] 

Karl Wright commented on CONNECTORS-49:
---

As per discussions in connectors-user, it's probably important to also provide 
a declaration of the name of the solr id field in the configuration, with a 
default value of id.  Longer term, maybe Solr can learn to accept a generic 
notion of primary key, but that's as yet undecided.



 Solr connector metadata and id field can collide, causing multiple id fields 
 to be passed in
 

 Key: CONNECTORS-49
 URL: https://issues.apache.org/jira/browse/CONNECTORS-49
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Lucene/SOLR connector
Reporter: Karl Wright
Assignee: Karl Wright

 If a document has a metadata field called id, or ID, or Id, or any such 
 thing, the Solr connector will blithely send both the document id and the 
 metadata id along to Solr, which will then crap out with an error.  The 
 solution is to map the metadata id field to something else, which should be 
 determined by the solr connection definition.


[jira] Created: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-22 Thread Karl Wright (JIRA)

Solr connector metadata and id field can collide, causing multiple id fields to 
be passed in


 Key: CONNECTORS-49
 URL: https://issues.apache.org/jira/browse/CONNECTORS-49
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Lucene/SOLR connector
Reporter: Karl Wright


If a document has a metadata field called id, or ID, or Id, or any such 
thing, the Solr connector will blithely send both the document id and the 
metadata id along to Solr, which will then crap out with an error.  The 
solution is to map the metadata id field to something else, which should be 
determined by the solr connection definition.
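
The mapping described above could be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the class name MetadataIdMapper, the method remap, and the replacement field name "metadata_id" are all hypothetical, and the id field name is assumed to come from the connection configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not the actual Solr connector code.
class MetadataIdMapper {
    /**
     * Returns a copy of the metadata in which any field whose name matches
     * the Solr id field (ignoring case) is renamed to renamedField.
     * If several fields collide (e.g. "id" and "ID"), the last one wins;
     * a real implementation would need to decide how to handle that.
     */
    static Map<String, String> remap(Map<String, String> metadata,
                                     String idField, String renamedField) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            String name = e.getKey().equalsIgnoreCase(idField)
                    ? renamedField
                    : e.getKey();
            result.put(name, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("ID", "ABC-123");
        metadata.put("title", "Quarterly report");
        // "id" and "metadata_id" stand in for values read from the
        // connection definition.
        System.out.println(remap(metadata, "id", "metadata_id"));
    }
}
```

With this in place, only the document id itself would be sent to Solr under the id field name.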



[jira] Assigned: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-22 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-49:
-

Assignee: Karl Wright

 Solr connector metadata and id field can collide, causing multiple id fields 
 to be passed in
 

 Key: CONNECTORS-49
 URL: https://issues.apache.org/jira/browse/CONNECTORS-49
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Lucene/SOLR connector
Reporter: Karl Wright
Assignee: Karl Wright

 If a document has a metadata field called id, or ID, or Id, or any such 
 thing, the Solr connector will blithely send both the document id and the 
 metadata id along to Solr, which will then crap out with an error.  The 
 solution is to map the metadata id field to something else, which should be 
 determined by the solr connection definition.


[jira] Resolved: (CONNECTORS-48) SharePoint rules description is incomplete

2010-06-18 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-48.
---

  Assignee: Karl Wright
Resolution: Fixed

Added a section on rule matching and implied rules - hope this helps.


 SharePoint rules description is incomplete
 --

 Key: CONNECTORS-48
 URL: https://issues.apache.org/jira/browse/CONNECTORS-48
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Assignee: Karl Wright

 The description of how SharePoint inclusion and exclusion rules work is 
 inadequate for an end user to be able to use the connector effectively.  
 Specifically, it does not explain how the connector matches a rule.


[jira] Created: (CONNECTORS-47) Framework UI seems to call connector post processing more than needed

2010-06-17 Thread Karl Wright (JIRA)

Framework UI seems to call connector post processing more than needed
-

 Key: CONNECTORS-47
 URL: https://issues.apache.org/jira/browse/CONNECTORS-47
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Framework crawler agent
Reporter: Karl Wright
Priority: Minor


Connector form post processing is currently invoked both in execute.jsp (which 
is the target of all form posts) and in individual edit pages (such as 
editconfig.jsp and editjob.jsp).  Unless a reason can be found for why this is 
done, the individual edit page calls should be removed, since they are by 
definition superfluous.

Possible reasons it was done this way were:

(a) that code predates execute.jsp
(b) some other functionality, e.g. copy or posting of certificates, needs it

At any rate, this should be looked at after the bulk of CONNECTORS-40 related 
changes are committed to trunk.



[jira] Created: (CONNECTORS-45) Solr connector gives no way to specify the solr core name

2010-06-16 Thread Karl Wright (JIRA)

Solr connector gives no way to specify the solr core name
-

 Key: CONNECTORS-45
 URL: https://issues.apache.org/jira/browse/CONNECTORS-45
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Lucene/SOLR connector
Reporter: Karl Wright


The Solr Connector allows you to specify everything about the Solr connection 
except the Solr core name.  A new configuration field, which is optional and 
defaults to blank, should be added to allow the core name to be set.
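
As a rough sketch of how an optional core name might be applied: the helper below is hypothetical (SolrUrlSketch and updateUrl are illustrative names, not connector code), and it assumes the core name is inserted as a path segment between the base URL and the handler path, with the base URL given without a trailing slash.

```java
// Hypothetical helper; the parameter names are illustrative only.
class SolrUrlSketch {
    /**
     * Builds the target URL, inserting the core name only when one is given.
     * Assumes baseUrl has no trailing slash and handler starts with "/".
     */
    static String updateUrl(String baseUrl, String coreName, String handler) {
        StringBuilder sb = new StringBuilder(baseUrl);
        if (coreName != null && !coreName.trim().isEmpty()) {
            sb.append('/').append(coreName.trim());
        }
        sb.append(handler);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Blank core name: behaves as the connector does today.
        System.out.println(updateUrl("http://localhost:8983/solr", "", "/update"));
        // Core name set: the core becomes part of the path.
        System.out.println(updateUrl("http://localhost:8983/solr", "core1", "/update"));
    }
}
```

Defaulting to blank keeps existing single-core connection definitions working unchanged.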



[jira] Commented: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-06-15 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879215#action_12879215
]

Karl Wright commented on CONNECTORS-40:
---

The implementation strategy is as follows:

(1) Add methods to the connector interfaces to support the UI. These
correspond directly to the chunks of UI contributed by each connector, which
used to be rendered by JSPs located through a naming convention. (Every
connector had a family of JSPs, e.g.
output/connector_name/headerconfig.jsp,
output/connector_name/editconfig.jsp, etc.) To make it easy to replace the
technology behind the framework side of the UI later, I also introduced some
interfaces so that there are no direct references to any JSP or servlet
classes.

(2) Change the framework UI to call the connector methods rather than the old
jsp components.

(3) Change all individual connectors to discard their JSPs and instead
implement the connector methods.
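
To make the three steps concrete, here is a minimal sketch of what such connector UI methods might look like. All names here (HtmlOutput, ConnectorUi, outputConfigurationBody, processConfigurationPost) are hypothetical and chosen for illustration; they are not the interfaces actually committed to the branch.

```java
import java.util.HashMap;
import java.util.Map;

class ConnectorUiSketch {
    /** Output abstraction, so connectors never reference JSP or servlet classes. */
    interface HtmlOutput {
        void print(String html);
    }

    /** Hypothetical UI contract that a connector class would implement. */
    interface ConnectorUi {
        void outputConfigurationBody(HtmlOutput out, Map<String, String> parameters);
        void processConfigurationPost(Map<String, String> posted,
                                      Map<String, String> parameters);
    }

    /** Example connector that renders and processes a single "server" field. */
    static class ExampleConnector implements ConnectorUi {
        public void outputConfigurationBody(HtmlOutput out,
                                            Map<String, String> parameters) {
            String server = parameters.getOrDefault("server", "");
            out.print("<input type=\"text\" name=\"server\" value=\""
                    + escape(server) + "\"/>");
        }

        public void processConfigurationPost(Map<String, String> posted,
                                             Map<String, String> parameters) {
            String server = posted.get("server");
            if (server != null) {
                parameters.put("server", server);
            }
        }

        /** Minimal HTML attribute escaping. */
        private static String escape(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;");
        }
    }

    /** Framework side: calls the connector method instead of including a JSP. */
    static String render(ConnectorUi connector, Map<String, String> parameters) {
        StringBuilder sb = new StringBuilder();
        connector.outputConfigurationBody(html -> sb.append(html), parameters);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put("server", "localhost");
        System.out.println(render(new ExampleConnector(), parameters));
    }
}
```

Because the connector only sees the small HtmlOutput interface, the framework could later swap JSPs for another technology without touching connector code.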

Once this preliminary work is done, it should be possible to write a class
loader to allow a user (or an installer) to specify a set of paths in which to
search for jars. This would make it possible for people to deliver connectors
into the system without having to rebuild the war file, which currently is
necessary. That, in turn, makes it feasible to prebuild all LCF components and
deliver them much like Solr is delivered.
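
A class loader along those lines could be sketched with the standard java.net.URLClassLoader. This is only an illustration under stated assumptions: the class and method names (PluginLoaderSketch, findJars, createLoader, instantiate) are hypothetical, only jars directly inside each directory are considered, and connectors are assumed to have a public no-argument constructor.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PluginLoaderSketch {
    /** Collects the URLs of all *.jar files found directly in the given directories. */
    static URL[] findJars(List<File> directories) throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        for (File dir : directories) {
            File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
            if (jars == null) {
                continue; // not a directory, or not readable
            }
            for (File jar : jars) {
                urls.add(jar.toURI().toURL());
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Creates a loader that searches the plug-in jars after the framework's own classes. */
    static ClassLoader createLoader(List<File> directories) throws MalformedURLException {
        return new URLClassLoader(findJars(directories),
                PluginLoaderSketch.class.getClassLoader());
    }

    /** Instantiates a class by name through the given loader. */
    static Object instantiate(ClassLoader loader, String className)
            throws ReflectiveOperationException {
        return Class.forName(className, true, loader)
                .getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // The directory would come from configuration; "." is a placeholder.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(Arrays.toString(findJars(Arrays.asList(dir))));
    }
}
```

Using the framework's loader as the parent means connector jars can see framework classes, while the war file itself never needs to be rebuilt.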

The CONNECTORS-40 branch currently contains just the following:
- UI method additions to the output connection interface only;
- Changes to the framework UI code to call the new methods;
- Changes to the GTS output connector to implement the new methods (and remove
the old JSPs).

The reason this has been checked in at this point is largely as a sanity check.
It's a lot easier to change direction when one connector has been done than it
would be to change 15 of them.

Hope this helps.

Classloader-based plug-in architecture would permit LCF to be prebuilt
--

Key: CONNECTORS-40
URL: https://issues.apache.org/jira/browse/CONNECTORS-40
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Karl Wright


[jira] Commented: (CONNECTORS-34) eRoom authority and connector

2010-06-10 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877408#action_12877408
 ] 

Karl Wright commented on CONNECTORS-34:
---

It turns out that EMC has released a new version of eRoom that uses Documentum 
as an implementation platform.  This would imply that no connector needs to be 
developed, except perhaps to support legacy eRoom installations.  Can anyone 
confirm this story?


 eRoom authority and connector
 -

 Key: CONNECTORS-34
 URL: https://issues.apache.org/jira/browse/CONNECTORS-34
 Project: Lucene Connector Framework
  Issue Type: New Feature
Reporter: Karl Wright

 eRoom has a SOAP API which looks like it has enough power to perhaps 
 implement a connector and an authority.  The eRoom API url is here (and yes, 
 it is a Swiss (.ch) url, but is legit):
 https://eroom.abraxas.ch/eroomHelp/en/API_Help/Api.htm#home_api.html


[jira] Commented: (CONNECTORS-44) Adding metadata support to JDBC connector

2010-06-10 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877409#action_12877409
 ] 

Karl Wright commented on CONNECTORS-44:
---

I think this feature has merit in its own right.  I'm a little leery about this 
becoming a Stellent connector, though, since:
(a) it's hardly end-user friendly for users to have to learn the Stellent 
schema;
(b) I'm sure Stellent has some kind of security, and this proposal would not 
address that.


 Adding metadata support to JDBC connector
 -

 Key: CONNECTORS-44
 URL: https://issues.apache.org/jira/browse/CONNECTORS-44
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: JDBC connector
 Environment: Windows, Oracle 10g, Oracle Universal Content Management 
 System
Reporter: Rohan G Patil
Priority: Critical
   Original Estimate: 0.02h
  Remaining Estimate: 0.02h

 The metadata for checked-in documents is stored in different fields of the 
 database, for example created date, author, title, etc.
 The BLOB object contains only the text of the document.  It would be very 
 helpful if we could add support for selecting metadata fields (columns in the 
 database) while querying the table.
 This support would make the JDBC connector a substitute for an Oracle UCM 
 (Stellent) connector.


[jira] Updated: (CONNECTORS-44) Adding metadata support to JDBC connector

2010-06-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-44?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-44:
--

 Original Estimate: 48h  (was: 0.02h)
Remaining Estimate: 48h  (was: 0.02h)
  Assignee: Karl Wright
  Priority: Major  (was: Critical)

 Adding metadata support to JDBC connector
 -

 Key: CONNECTORS-44
 URL: https://issues.apache.org/jira/browse/CONNECTORS-44
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: JDBC connector
 Environment: Windows, Oracle 10g, Oracle Universal Content Management 
 System
Reporter: Rohan G Patil
Assignee: Karl Wright
   Original Estimate: 48h
  Remaining Estimate: 48h

 The metadata for checked-in documents is stored in different fields of the 
 database, for example created date, author, title, etc.
 The BLOB object contains only the text of the document.  It would be very 
 helpful if we could add support for selecting metadata fields (columns in the 
 database) while querying the table.
 This support would make the JDBC connector a substitute for an Oracle UCM 
 (Stellent) connector.


[jira] Resolved: (CONNECTORS-44) Adding metadata support to JDBC connector

2010-06-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-44?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-44.
---

Resolution: Fixed

Committed fix in svn revision 953386.


 Adding metadata support to JDBC connector
 -

 Key: CONNECTORS-44
 URL: https://issues.apache.org/jira/browse/CONNECTORS-44
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: JDBC connector
 Environment: Windows, Oracle 10g, Oracle Universal Content Management 
 System
Reporter: Rohan G Patil
Assignee: Karl Wright
   Original Estimate: 48h
  Remaining Estimate: 48h

 The metadata for checked-in documents is stored in different fields of the 
 database, for example created date, author, title, etc.
 The BLOB object contains only the text of the document.  It would be very 
 helpful if we could add support for selecting metadata fields (columns in the 
 database) while querying the table.
 This support would make the JDBC connector a substitute for an Oracle UCM 
 (Stellent) connector.


[jira] Assigned: (CONNECTORS-43) Useless call to String.trim() in org.apache.lcf.ui.util.MultilineParser

2010-06-09 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-43?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-43:
-

Assignee: Karl Wright

 Useless call to String.trim() in org.apache.lcf.ui.util.MultilineParser
 ---

 Key: CONNECTORS-43
 URL: https://issues.apache.org/jira/browse/CONNECTORS-43
 Project: Lucene Connector Framework
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Karl Wright
Priority: Trivial

 {code}
 nextString.trim();
 {code}
 should likely be:
 {code}
 nextString = nextString.trim();
 {code}
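
 The reason the original call has no effect is that Java strings are 
 immutable: String.trim() returns a new string and leaves the receiver 
 untouched.  A small standalone demonstration (TrimSketch is an illustrative 
 class, not part of MultilineParser):

```java
class TrimSketch {
    /** Mirrors the reported bug: the trimmed copy is discarded. */
    static String trimDiscarded(String nextString) {
        nextString.trim(); // result thrown away; nextString is unchanged
        return nextString;
    }

    /** The suggested fix: keep the value returned by trim(). */
    static String trimAssigned(String nextString) {
        nextString = nextString.trim();
        return nextString;
    }

    public static void main(String[] args) {
        String input = "  value  ";
        System.out.println("[" + trimDiscarded(input) + "]");
        System.out.println("[" + trimAssigned(input) + "]");
    }
}
```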


[jira] Created: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-06-02 Thread Karl Wright (JIRA)

Classloader-based plug-in architecture would permit LCF to be prebuilt
--

 Key: CONNECTORS-40
 URL: https://issues.apache.org/jira/browse/CONNECTORS-40
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright


The LCF architecture at this point requires interaction with the build script 
in order to add connectors.  This is because the connector JSPs and jars need 
to be added to the appropriate war files.  However, there is another 
architectural option that would eliminate this need, which is to use a custom 
classloader to pull components from jars that are placed in a specific 
directory or directories.

In order for this to work, however, the UI components of every connector must 
become part of a jar.  That implies that they will need to cease being JSPs, 
and become instead methods of each connector class.  (There is no proscription 
against using something like Velocity for assembling the necessary output for a 
connector, however.)  Limiting the backwards-compatibility impact of this 
change will be difficult, especially after a first release is made, so it seems 
clear that any change along these lines should be attempted before version 1.0 
is released.



[jira] Created: (CONNECTORS-41) Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.

2010-06-02 Thread Karl Wright (JIRA)

Add hooks to output connectors for receiving event notifications, specifically 
job start, job end, etc.
---

 Key: CONNECTORS-41
 URL: https://issues.apache.org/jira/browse/CONNECTORS-41
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright
Priority: Minor


Currently there is no logic that informs an output connection of a job start, 
end, deletion, or other activity.  While this would seem to have little to do 
with an output connector, this feature has been requested by Jack Krupansky as 
a potential way of deciding when to tell Solr to commit documents, rather than 
leave it up to Solr's configuration.
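
One possible shape for such hooks is sketched below. The interface and method names (JobEventListener, noteJobStart, noteJobEnd) are hypothetical, as is the runJob driver; the example listener only records what it would do rather than talking to Solr.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JobEventSketch {
    /** Hypothetical notification hooks an output connector could implement. */
    interface JobEventListener {
        void noteJobStart(String jobId);
        void noteJobEnd(String jobId);
    }

    /** Example listener that would ask the index to commit when a job ends. */
    static class CommitOnJobEnd implements JobEventListener {
        final List<String> log = new ArrayList<>();

        public void noteJobStart(String jobId) {
            log.add("start " + jobId);
        }

        public void noteJobEnd(String jobId) {
            // A real connector would send a commit request here.
            log.add("commit after " + jobId);
        }
    }

    /** Framework side: notify listeners before and after the crawl runs. */
    static void runJob(String jobId, List<JobEventListener> listeners, Runnable crawl) {
        for (JobEventListener l : listeners) {
            l.noteJobStart(jobId);
        }
        crawl.run();
        for (JobEventListener l : listeners) {
            l.noteJobEnd(jobId);
        }
    }

    public static void main(String[] args) {
        CommitOnJobEnd listener = new CommitOnJobEnd();
        runJob("job1", Arrays.<JobEventListener>asList(listener),
                () -> System.out.println("crawling..."));
        System.out.println(listener.log);
    }
}
```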



[jira] Resolved: (CONNECTORS-39) Database abstraction layer does not abstract from transactions

2010-06-02 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-39.
---

Resolution: Fixed

 Database abstraction layer does not abstract from transactions
 --

 Key: CONNECTORS-39
 URL: https://issues.apache.org/jira/browse/CONNECTORS-39
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright
Assignee: Karl Wright

 The database abstraction layer in LCF does not permit someone to abstract 
 from transaction management.  That responsibility is delegated to a different 
 class, which presumes that transaction management is not database-type 
 dependent.  Unfortunately, this is not the case.
 A better code structure would involve creating an abstract base class that 
 performed the transaction management and caching, and causing all database 
 implementations to be derived from it.  Then, abstract methods for 
 transaction begin and end could be readily defined.


[jira] Resolved: (CONNECTORS-35) Need a way to reset LCF when external conditions change

2010-06-02 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-35?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright resolved CONNECTORS-35.
---

Resolution: Fixed

Have decided that the current functionality is adequate, and no further work
needs to be done.

Need a way to reset LCF when external conditions change
---

Key: CONNECTORS-35
URL: https://issues.apache.org/jira/browse/CONNECTORS-35
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework agents process, Framework core, Framework
crawler agent
Reporter: Karl Wright
Assignee: Karl Wright

When a change is made external to LCF, such as a Solr configuration change,
LCF needs some way for a user to signal that that change took place. For
example, a button or link on the view output connection page might signal
some undefined global change in the target of that output connection. A
similar button or link on the repository connection view page might signal a
corresponding change to the underlying repository.
Clicking the button would do the following things:
(1) It would clear the current version string for all documents that passed
through that connection. This would guarantee that the documents would be
reingested if and when they were processed the next time.
(2) It would reset the last job time value for all jobs affected by the
connection to zero. This would guarantee that all documents belonging to
that job would be rechecked.


[jira] Created: (CONNECTORS-37) LCF should use an XML configuration file, not the simple name/value config file it currently has

2010-06-01 Thread Karl Wright (JIRA)

LCF should use an XML configuration file, not the simple name/value config file 
it currently has


 Key: CONNECTORS-37
 URL: https://issues.apache.org/jira/browse/CONNECTORS-37
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright


LCF's configuration file is limited in what it can specify, and XML 
configuration files seem to offer more flexibility and are the modern norm.  
Before backwards compatibility becomes an issue, it may therefore be worth 
converting the property file reader to use XML rather than name/value format.  
It would also be nice to be able to fold the logging configuration into the 
same file, if this seems possible.
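
For illustration, an XML property reader could be as small as the sketch below, which uses the JDK's built-in DOM parser. The file layout (a configuration element containing property elements with name and value attributes) and the property names are assumptions made up for this example, not a proposed LCF format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlConfigSketch {
    /** Reads name/value pairs from <property name="..." value="..."/> elements. */
    static Map<String, String> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("property");
        Map<String, String> properties = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            properties.put(e.getAttribute("name"), e.getAttribute("value"));
        }
        return properties;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical file content; the property names are placeholders.
        String xml = "<configuration>"
                + "<property name=\"example.database.name\" value=\"lcfdb\"/>"
                + "<property name=\"example.logging.file\" value=\"logging.ini\"/>"
                + "</configuration>";
        System.out.println(parse(xml));
    }
}
```

Starting from an element-based format like this would leave room for nested sections later, such as a logging block, which flat name/value pairs cannot express.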



[jira] Commented: (CONNECTORS-37) LCF should use an XML configuration file, not the simple name/value config file it currently has

2010-06-01 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874000#action_12874000
]

Karl Wright commented on CONNECTORS-37:
---

Which comes first, chicken or egg? The current properties file specifies quite
a bit of stuff about database implementation and access, so obviously that
can't go into the database. Also, the pointer to the logging configuration
file, and any other file pointers, probably should stay out of the database,
since these tend to be local instance configuration rather than global
configuration. While I'm sure that there are still *some* configuration
parameters that are legitimately global in nature, most of the serious
configuration (like connections, authorities, jobs, etc.) is already in the
database.

So maybe this ticket should read, "... excluding all global configuration
information, which should be moved to a database table..."

The driver behind this ticket, FWIW, is a complaint that configuring LCF
requires repeated user interaction with the database - and that user prefers
solr-style XML config files instead. I don't necessarily buy that view, but
using XML instead of name/value pairs seemed like a wise precaution. ;-)

LCF should use an XML configuration file, not the simple name/value config
file it currently has

Key: CONNECTORS-37
URL: https://issues.apache.org/jira/browse/CONNECTORS-37
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Karl Wright

LCF's configuration file is limited in what it can specify, and XML
configuration files seem to offer more flexibility and are the modern norm.
Before backwards compatibility becomes an issue, it may therefore be worth
converting the property file reader to use XML rather than name/value format.
It would also be nice to be able to fold the logging configuration into the
same file, if this seems possible.


[jira] Commented: (CONNECTORS-37) LCF should use an XML configuration file, not the simple name/value config file it currently has

2010-06-01 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874035#action_12874035
]

Karl Wright commented on CONNECTORS-37:
---

I am not happy with the idea of configuration living in both the database and
in an XML file. The idea that you can somehow read the XML configuration just
once the first time LCF is started seems rife with potential problems. Far
from improving the user experience, I think that the proposed design would
instead create enormous confusion.

Perhaps the problem is that Mr. Krupansky is attempting to do too much with a
single configuration file here. It would be perfectly reasonable to introduce
a "read setup information" command that would read what is effectively a
sequence of commands from an XML file. However, that command file would be an
"execute once" kind of affair - although it could be coded in such a way as to
ignore the definition of entities that already exist in the database.
Nevertheless, such a file would have a very different usage pattern than the
configuration file as it exists today, so I'd have a lot of concern using the
same configuration file for both purposes.

LCF should use an XML configuration file, not the simple name/value config
file it currently has

Key: CONNECTORS-37
URL: https://issues.apache.org/jira/browse/CONNECTORS-37
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework core
Reporter: Karl Wright


[jira] Assigned: (CONNECTORS-39) Database abstraction layer does not abstract from transactions

2010-06-01 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-39:
-

Assignee: Karl Wright

 Database abstraction layer does not abstract from transactions
 --

 Key: CONNECTORS-39
 URL: https://issues.apache.org/jira/browse/CONNECTORS-39
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright
Assignee: Karl Wright

 The database abstraction layer in LCF does not permit someone to abstract 
 from transaction management.  That responsibility is delegated to a different 
 class, which presumes that transaction management is not database-type 
 dependent.  Unfortunately, this is not the case.
 A better code structure would involve creating an abstract base class that 
 performed the transaction management and caching, and causing all database 
 implementations to be derived from it.  Then, abstract methods for 
 transaction begin and end could be readily defined.


[jira] Created: (CONNECTORS-39) Database abstraction layer does not abstract from transactions

2010-06-01 Thread Karl Wright (JIRA)

Database abstraction layer does not abstract from transactions
--

 Key: CONNECTORS-39
 URL: https://issues.apache.org/jira/browse/CONNECTORS-39
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright


The database abstraction layer in LCF does not permit someone to abstract from 
transaction management.  That responsibility is delegated to a different class, 
which presumes that transaction management is not database-type dependent.  
Unfortunately, this is not the case.

A better code structure would involve creating an abstract base class that 
performed the transaction management and caching, and causing all database 
implementations to be derived from it.  Then, abstract methods for transaction 
begin and end could be readily defined.
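
The proposed structure could be sketched as follows. This is an illustration only: the class names are hypothetical, caching is omitted, and the nesting behaviour (only the outermost begin/end reaches the database, and any failed inner level forces a rollback) is an assumption of the sketch rather than something stated in this ticket.

```java
import java.util.ArrayList;
import java.util.List;

class TransactionSketch {
    /** Base class owning transaction bookkeeping; subclasses supply the database-specific part. */
    abstract static class BaseDatabase {
        private int depth = 0;
        private boolean rollbackRequested = false;

        public final void beginTransaction() {
            if (depth == 0) {
                startTransaction();
            }
            depth++;
        }

        public final void endTransaction(boolean success) {
            if (depth == 0) {
                throw new IllegalStateException("no transaction in progress");
            }
            if (!success) {
                rollbackRequested = true;
            }
            depth--;
            if (depth == 0) {
                boolean rollback = rollbackRequested;
                rollbackRequested = false;
                if (rollback) {
                    rollbackTransaction();
                } else {
                    commitTransaction();
                }
            }
        }

        // Database-type-dependent operations, defined by each implementation.
        protected abstract void startTransaction();
        protected abstract void commitTransaction();
        protected abstract void rollbackTransaction();
    }

    /** Stand-in implementation that records the statements it would issue. */
    static class RecordingDatabase extends BaseDatabase {
        final List<String> statements = new ArrayList<>();

        protected void startTransaction() { statements.add("BEGIN"); }
        protected void commitTransaction() { statements.add("COMMIT"); }
        protected void rollbackTransaction() { statements.add("ROLLBACK"); }
    }

    public static void main(String[] args) {
        RecordingDatabase db = new RecordingDatabase();
        db.beginTransaction();
        db.beginTransaction();   // nested: no second BEGIN is issued
        db.endTransaction(true);
        db.endTransaction(true);
        System.out.println(db.statements);
    }
}
```

Each database implementation would then override only the three abstract methods, while callers keep using beginTransaction and endTransaction.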



[jira] Resolved: (CONNECTORS-36) The Solr connector's UI method of handling arguments is limited and non-intuitive

2010-05-19 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-36?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-36.
---

  Assignee: Karl Wright
Resolution: Fixed

Revised UI as stipulated.

r946090.



 The Solr connector's UI method of handling arguments is limited and 
 non-intuitive
 -

 Key: CONNECTORS-36
 URL: https://issues.apache.org/jira/browse/CONNECTORS-36
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Lucene/SOLR connector
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor

 The arguments are currently ordered by name, and are stored in a simple hash, 
 meaning that they cannot be multivalued.  Furthermore you cannot edit 
 arguments; you can only delete and replace them.  It would be better if:
 - Argument names were ordered, but values appeared in the order they were 
 entered.
 - Each argument value appeared in a text box, so it could be edited directly.
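
 A data structure with the two properties listed above might look like the 
 following sketch, which keeps names sorted and values in entry order.  
 ArgumentListSketch and its methods are illustrative names, not the 
 connector's actual classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ArgumentListSketch {
    // TreeMap keeps argument names sorted; each list keeps entry order.
    private final Map<String, List<String>> arguments = new TreeMap<>();

    /** Adds a value; the same name may be added more than once. */
    void add(String name, String value) {
        arguments.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Edits one value in place instead of deleting and re-adding it. */
    void edit(String name, int index, String newValue) {
        List<String> values = arguments.get(name);
        if (values == null) {
            throw new IllegalArgumentException("unknown argument: " + name);
        }
        values.set(index, newValue);
    }

    /** Returns "name=value" rows in the order the UI would display them. */
    List<String> describe() {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : arguments.entrySet()) {
            for (String value : e.getValue()) {
                rows.add(e.getKey() + "=" + value);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        ArgumentListSketch list = new ArgumentListSketch();
        list.add("fq", "type:pdf");
        list.add("commit", "true");
        list.add("fq", "lang:en");
        System.out.println(list.describe());
    }
}
```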


[jira] Commented: (CONNECTORS-35) Need a way to reset LCF when external conditions change

2010-05-14 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867513#action_12867513
]

Karl Wright commented on CONNECTORS-35:
---

Added the ability to perform this reset from the view output connection
screen. Still not sure if we really need a repository-connection equivalent;
that's in any case much harder, because the ingeststatus table has no column at
this time containing the repository connection name by itself.

r944298

Need a way to reset LCF when external conditions change
---


[jira] Created: (CONNECTORS-35) Need a way to reset LCF when external conditions change

2010-05-13 Thread Karl Wright (JIRA)

Need a way to reset LCF when external conditions change
---

 Key: CONNECTORS-35
 URL: https://issues.apache.org/jira/browse/CONNECTORS-35
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework agents process, Framework core, Framework 
crawler agent
Reporter: Karl Wright


When a change is made external to LCF, such as a Solr configuration change, LCF 
needs some way for a user to signal that that change took place.  For example, 
a button or link on the view output connection page might signal some 
undefined global change in the target of that output connection.  A similar 
button or link on the repository connection view page might signal a 
corresponding change to the underlying repository.

Clicking the button would do the following things:

(1) It would clear the current version string for all documents that passed 
through that connection.  This would guarantee that the documents would be 
reingested if and when they were processed the next time.

(2) It would reset the last job time value for all jobs affected by the 
connection to zero.  This would guarantee that all documents belonging to that 
job would be rechecked.
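
The two steps above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `DocumentRecord`, `JobRecord`, and `ConnectionReset` are hypothetical names, a cleared version string is represented as `null`, and the real framework keeps this state in database tables rather than in Java lists.

```java
import java.util.List;

// Hypothetical stand-in for a per-document ingestion record.
final class DocumentRecord {
    final String outputConnection;
    String versionString; // null means "cleared" in this sketch (an assumption)

    DocumentRecord(String outputConnection, String versionString) {
        this.outputConnection = outputConnection;
        this.versionString = versionString;
    }
}

// Hypothetical stand-in for a job that uses a connection.
final class JobRecord {
    final String outputConnection;
    long lastJobTime;

    JobRecord(String outputConnection, long lastJobTime) {
        this.outputConnection = outputConnection;
        this.lastJobTime = lastJobTime;
    }
}

final class ConnectionReset {
    // Applies steps (1) and (2) to everything tied to the named connection.
    static void reset(String connection, List<DocumentRecord> docs, List<JobRecord> jobs) {
        for (DocumentRecord d : docs) {
            if (d.outputConnection.equals(connection)) {
                d.versionString = null; // step (1): forces reingestion on next processing
            }
        }
        for (JobRecord j : jobs) {
            if (j.outputConnection.equals(connection)) {
                j.lastJobTime = 0L; // step (2): forces every document to be rechecked
            }
        }
    }
}
```

Records belonging to other connections are left untouched, so the reset stays scoped to the connection whose target changed.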






[jira] Assigned: (CONNECTORS-35) Need a way to reset LCF when external conditions change

2010-05-13 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-35?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright reassigned CONNECTORS-35:
-

Assignee: Karl Wright

Need a way to reset LCF when external conditions change
---


[jira] Resolved: (CONNECTORS-33) Need a wiki page for people who want to operate LCF programmatically

2010-05-12 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-33.
---

Resolution: Fixed

Here's the page:

https://cwiki.apache.org/confluence/display/CONNECTORS/Programmatic+Operation+of+LCF


 Need a wiki page for people who want to operate LCF programmatically
 

 Key: CONNECTORS-33
 URL: https://issues.apache.org/jira/browse/CONNECTORS-33
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright

 The necessary commands are present, but we still need a wiki page to document 
 how to manipulate LCF programmatically.


[jira] Commented: (CONNECTORS-34) eRoom authority and connector

2010-05-10 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865868#action_12865868
 ] 

Karl Wright commented on CONNECTORS-34:
---

.ch/.cn - so close. ;-)

 eRoom authority and connector
 -

 Key: CONNECTORS-34
 URL: https://issues.apache.org/jira/browse/CONNECTORS-34
 Project: Lucene Connector Framework
  Issue Type: New Feature
Reporter: Karl Wright

 eRoom has a SOAP API which looks like it has enough power to perhaps 
 implement a connector and an authority.  The eRoom API url is here (and yes, 
 it is a chinese url, but is legit):
 https://eroom.abraxas.ch/eroomHelp/en/API_Help/Api.htm#home_api.html


[jira] Created: (CONNECTORS-33) Need a wiki page for people who want to operate LCF programmatically

2010-04-30 Thread Karl Wright (JIRA)

Need a wiki page for people who want to operate LCF programmatically


 Key: CONNECTORS-33
 URL: https://issues.apache.org/jira/browse/CONNECTORS-33
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright


The necessary commands are present, but we still need a wiki page to document 
how to manipulate LCF programmatically.



[jira] Assigned: (CONNECTORS-29) Credentials are not properly encoded when sent to JCIFS, making passwords with %'s or #'s not work properly

2010-04-29 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-29?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-29:
-

Assignee: Karl Wright

 Credentials are not properly encoded when sent to JCIFS, making passwords 
 with %'s or #'s not work properly
 ---

 Key: CONNECTORS-29
 URL: https://issues.apache.org/jira/browse/CONNECTORS-29
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: JCIFS connector
Reporter: Karl Wright
Assignee: Karl Wright

 The credentials assembled by the JCIFS connector do not properly encode 
 usernames and passwords using %-encoding as JCIFS expects.  This leads to 
 passwords with %'s or #'s in them not working properly.
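
The kind of encoding described here can be sketched as follows. This is a minimal illustration, not the connector's actual fix: the class and method names (`SmbCredentialEncoder`, `encodeUserInfo`, `buildUrl`) are hypothetical, and it assumes the credentials are embedded in an `smb://user:password@host/share/` style URL. `java.net.URLEncoder` is deliberately avoided because it produces form encoding (a space becomes `+`) rather than plain %-encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of LCF or jCIFS.
final class SmbCredentialEncoder {
    // RFC 3986 "unreserved" characters, which never need escaping.
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Percent-encodes every UTF-8 byte that is not an unreserved character.
    static String encodeUserInfo(String raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (UNRESERVED.indexOf(c) >= 0) {
                sb.append(c);
            } else {
                sb.append('%').append(String.format("%02X", b & 0xFF));
            }
        }
        return sb.toString();
    }

    // Assembles the URL with both credential parts encoded separately.
    static String buildUrl(String user, String password, String host, String share) {
        return "smb://" + encodeUserInfo(user) + ":" + encodeUserInfo(password)
            + "@" + host + "/" + share + "/";
    }
}
```

With this sketch, a password such as `p%ss#1` is emitted as `p%25ss%231`, so neither the `%` nor the `#` can be misread as URL syntax.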


[jira] Resolved: (CONNECTORS-21) Authority service needed that knows how to obtain SIDs from a Kerberos principal

2010-04-29 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-21.
---

Resolution: Fixed

I created a Java authority instead, using JNDI, so this is moot.


 Authority service needed that knows how to obtain SIDs from a Kerberos 
 principal
 

 Key: CONNECTORS-21
 URL: https://issues.apache.org/jira/browse/CONNECTORS-21
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Mod-authz-annotate
Reporter: Karl Wright

 The code that was granted to Apache from MetaCarta intentionally did not 
 include an authority service that knows how to obtain SIDs from a Kerberos 
 principal.  This will invalidate the security enforcement for the FileNet, 
 Meridio, and SharePoint connectors, since these use AD as their primary 
 security model.


[jira] Created: (CONNECTORS-31) For the Solr LCF security filter plugin, establish a concept of session to improve performance

2010-04-29 Thread Karl Wright (JIRA)

For the Solr LCF security filter plugin, establish a concept of session to 
improve performance
--

 Key: CONNECTORS-31
 URL: https://issues.apache.org/jira/browse/CONNECTORS-31
 Project: Lucene Connector Framework
  Issue Type: Improvement
Reporter: Karl Wright


Instead of only allowing an authenticated user name to be passed to the 
LCFSecurityFilter SearchComponent, improve this to return a security token and 
optionally receive the security token as well.  Then it will be possible for it 
to make the access tokens sticky, reducing load on the authority service in 
situations where multiple searches occur in each session.
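
One way to picture the proposed session concept is the sketch below. It is an assumption-laden illustration, not the plugin's design: `SessionTokenCache` is a hypothetical class, the authority lookup is modeled as a plain `Function` from user name to access tokens, and session expiry is left out entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache that makes access tokens "sticky" for a session.
final class SessionTokenCache {
    private final Map<String, List<String>> tokensBySession = new ConcurrentHashMap<>();
    private final Function<String, List<String>> authority; // stand-in for the authority service

    SessionTokenCache(Function<String, List<String>> authority) {
        this.authority = authority;
    }

    // First search in a session: ask the authority once and hand back a security token.
    String open(String userName) {
        String sessionToken = UUID.randomUUID().toString();
        tokensBySession.put(sessionToken, authority.apply(userName));
        return sessionToken;
    }

    // Later searches: reuse the cached access tokens; returns null for an unknown token.
    List<String> accessTokens(String sessionToken) {
        return tokensBySession.get(sessionToken);
    }
}
```

Only `open` contacts the authority, so repeated searches within one session cost a map lookup. A real implementation would also need to expire entries so that revoked permissions take effect.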



[jira] Commented: (CONNECTORS-27) Add support for observation to the crawler agent

2010-04-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857861#action_12857861
 ] 

Karl Wright commented on CONNECTORS-27:
---

I understand what your proposed infrastructure does.  What I don't understand 
is the use case.  It seems to me like all you are doing is adding a poll method 
to a repository connector.  But there already is one.  Can you provide a case 
which demonstrates the need for this infrastructure?


 Add support for observation to the crawler agent
 

 Key: CONNECTORS-27
 URL: https://issues.apache.org/jira/browse/CONNECTORS-27
 Project: Lucene Connector Framework
  Issue Type: New Feature
  Components: Framework crawler agent
Reporter: Ralph Benjamin Ruijs
Priority: Minor
 Attachments: Added_observation_logic_to_the_crawler.patch


 When crawling a large repository, it could take a lot of time before changes 
 are propagated to Solr. You can add an event listener to the repository, and 
 be notified about changes. The crawler will ensure you have a complete copy 
 in case of missed events.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Assigned: (CONNECTORS-16) JCIFS connector's document fingerprinting feature is not general enough

2010-03-18 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright reassigned CONNECTORS-16:
-

Assignee: Karl Wright

JCIFS connector's document fingerprinting feature is not general enough
---

Key: CONNECTORS-16
URL: https://issues.apache.org/jira/browse/CONNECTORS-16
Project: Lucene Connector Framework
Issue Type: Improvement
Components: Framework agents process, Framework crawler agent, GTS
connector, JCIFS connector, LiveLink connector, Lucene/SOLR connector,
Meridio connector, RSS connector, SharePoint connector, Web connector
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor

The JCIFS connector has a feature, called fingerprinting, which allows it
to classify documents according to the ability of the back-end to index that
content. Right at the moment, this fingerprinter is capable of recognizing
PDFs, Microsoft Office files, and text files as being indexable. One could
imagine, though, that different SOLR plugins, etc. might have more capability
than that. Also, other connectors could potentially benefit from similar
technology, specifically any connector that deals with binary documents.

One approach to solving this problem would be to remove the feature entirely,
and allow whatever pipeline exists in SOLR to determine the indexability after
the fact. The reason this feature was added at MetaCarta, however, is that
it may be possible to exclude an un-useful document without having to fetch
the whole thing, and (at least for MetaCarta clients) the number of
unindexable files of gigantic size was a big concern.

Another approach might be to tie the functionality in with the output
connector interface, so that an output connector would (somehow) determine
applicability of a document. This would require some care to make it
possible to fingerprint without having to download the entire document, but
would otherwise have the correct overall structure.


[jira] Resolved: (CONNECTORS-24) SOLR connector needs the ability to ingest metadata

2010-03-18 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-24.
---

Resolution: Fixed

Oops, I'd forgotten that this was actually already done.


 SOLR connector needs the ability to ingest metadata
 ---

 Key: CONNECTORS-24
 URL: https://issues.apache.org/jira/browse/CONNECTORS-24
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Lucene/SOLR connector
Reporter: Karl Wright

 The SOLR connector is pretty bare-bones at the moment, and even lacks the 
 ability to transmit metadata to SOLR.


[jira] Assigned: (CONNECTORS-23) Command documentation could benefit from usage information

2010-03-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-23?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-23:
-

Assignee: Karl Wright

 Command documentation could benefit from usage information
 --

 Key: CONNECTORS-23
 URL: https://issues.apache.org/jira/browse/CONNECTORS-23
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Damien Mabin
Assignee: Karl Wright
Priority: Minor

 It's about the page: 
 [Build & Deploy|http://cwiki.apache.org/confluence/display/CONNECTORS/How+to+Build+and+Deploy+Lucene+Connectors+Framework]
 In the paragraph about Commands, each command should be associated with an 
 example of use, something like this:
 ||Core Command Class||Function||
 |org.apache.lcf.core.DBCreate|Create LCF database instance
 eg : java org.apache.lcf.core.DBCreate UserName Password|
 |org.apache.lcf.core.DBDrop|Drop LCF database instance
 eg : java org.apache.lcf.core.DBDrop|


[jira] Created: (CONNECTORS-16) JCIFS connector's document fingerprinting feature is not general enough

2010-03-04 Thread Karl Wright (JIRA)

JCIFS connector's document fingerprinting feature is not general enough
---

 Key: CONNECTORS-16
 URL: https://issues.apache.org/jira/browse/CONNECTORS-16
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework agents process, Framework crawler agent, GTS 
connector, JCIFS connector, LiveLink connector, Lucene/SOLR connector, Meridio 
connector, RSS connector, SharePoint connector, Web connector
Reporter: Karl Wright
Priority: Minor


The JCIFS connector has a feature, called fingerprinting, which allows it to 
classify documents according to the ability of the back-end to index that content.  
Right at the moment, this fingerprinter is capable of recognizing PDFs, 
Microsoft Office files, and text files as being indexable.  One could imagine, 
though, that different SOLR plugins, etc. might have more capability than that. 
 Also, other connectors could potentially benefit from similar technology, 
specifically any connector that deals with binary documents.

One approach to solving this problem would be to remove the feature entirely, 
and allow whatever pipeline exists in SOLR to determine the indexability after the 
fact.  The reason this feature was added at MetaCarta, however, is that it may 
be possible to exclude an un-useful document without having to fetch the whole 
thing, and (at least for MetaCarta clients) the number of unindexable files of 
gigantic size was a big concern.

Another approach might be to tie the functionality in with the output connector 
interface, so that an output connector would (somehow) determine applicability 
of a document.  This would require some care to make it possible to fingerprint 
without having to download the entire document, but would otherwise have the 
correct overall structure.
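
Fingerprinting from a document prefix, without downloading the whole file, might look like the sketch below. It is not the JCIFS connector's actual algorithm: `PrefixFingerprinter` is a hypothetical class, and the "text" check (no NUL byte in the prefix) is a crude heuristic chosen for illustration. The two signatures used are the standard `%PDF-` header and the OLE2 compound-file header used by legacy Microsoft Office formats.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical classifier that inspects only the first bytes of a document.
final class PrefixFingerprinter {
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    // OLE2 compound file signature (legacy .doc/.xls/.ppt).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the prefix looks like a PDF, an OLE2 Office file, or plain text.
    static boolean looksIndexable(byte[] prefix) {
        if (startsWith(prefix, PDF) || startsWith(prefix, OLE2)) {
            return true;
        }
        if (prefix.length == 0) {
            return false; // nothing to judge
        }
        for (byte b : prefix) {
            if (b == 0) {
                return false; // a NUL byte suggests binary content (heuristic)
            }
        }
        return true;
    }

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would fetch only a small prefix of each file, pass it to `looksIndexable`, and skip the full download when the result is false, which is the saving the description attributes to the feature.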




[jira] Updated: (CONNECTORS-15) Documentum Connector testing code references a not-present class

2010-03-02 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-15:
--

Component/s: Documentum connector

 Documentum Connector testing code references a not-present class
 

 Key: CONNECTORS-15
 URL: https://issues.apache.org/jira/browse/CONNECTORS-15
 Project: Lucene Connector Framework
  Issue Type: Test
  Components: Documentum connector
Reporter: Karl Wright

 The documentum connector Java testing code references a class from 
 TrinityTechnologies, which was not granted.  This class reference should be 
 removed and replaced by direct references to the appropriate DFC methods.


[jira] Updated: (CONNECTORS-4) Submit other package changes supplied with software grant upstream to the proper projects

2010-03-02 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-4?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-4:
-

Component/s: LiveLink connector

 Submit other package changes supplied with software grant upstream to the 
 proper projects
 -

 Key: CONNECTORS-4
 URL: https://issues.apache.org/jira/browse/CONNECTORS-4
 Project: Lucene Connector Framework
  Issue Type: Task
  Components: Framework agents process, Framework crawler agent, 
 LiveLink connector, Meridio connector, RSS connector, SharePoint connector, 
 Web connector
Reporter: Karl Wright
Assignee: Karl Wright

 The code granted by MetaCarta depends on certain specific feature additions 
 and changes MetaCarta made to some packages it depends upon, specifically 
 jCIFS, commons-httpclient, and xerces-j.  These changes should be percolated 
 accordingly.  They can be found in the tarball under the directory 
 upstream-diffs.


[jira] Commented: (CONNECTORS-4) Submit other package changes supplied with software grant upstream to the proper projects

2010-02-24 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837779#action_12837779
]

Karl Wright commented on CONNECTORS-4:
--

HttpClient team wants us to upgrade to their latest release, which is 4.1.
They claim this fixes 2 of the 3 patches I submitted. For the record, the
patches were submitted under tickets:

HTTPCLIENT-917
HTTPCLIENT-918
HTTPCLIENT-919

The one they rejected outright was ticket HTTPCLIENT-919, because they
believed it violated Apache policy as pertaining to potential IP infringement,
specifically because NTLM is a proprietary authentication and authorization
scheme. There was no indication that they were aware of any specific patent
issues, but that apparently is not the key point.

If this reasoning stands, I intend to create two additional tickets - one for
moving to HttpClient 4.1, and one for modifying the build scripts to obtain an
appropriate NTLM implementation from some non-Apache open-source project.

Submit other package changes supplied with software grant upstream to the
proper projects
-

Key: CONNECTORS-4
URL: https://issues.apache.org/jira/browse/CONNECTORS-4
Project: Lucene Connector Framework
Issue Type: Task
Reporter: Karl Wright
Assignee: Karl Wright

The code granted by MetaCarta depends on certain specific feature additions
and changes MetaCarta made to some packages it depends upon, specifically
jCIFS, commons-httpclient, and xerces-j. These changes should be percolated
accordingly. They can be found in the tarball under the directory
upstream-diffs.


[jira] Updated: (CONNECTORS-12) Need to make use of tabs and spaces consistent in code base

2010-02-18 Thread Karl Wright (JIRA)

[
https://issues.apache.org/jira/browse/CONNECTORS-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Karl Wright updated CONNECTORS-12:
--

Priority: Minor (was: Major)
Description:
Some java files have tabs, some have spaces. Any individual file has either
one or the other, but not both.

We should decide which one we prefer, or adopt the Apache standard if there is
one, and convert accordingly.

(The jsps are all consistent and use only tabs, which in my opinion should
remain because their mixed nature makes spaces hard to work with in some
editors, like scite.)

was:
Some java files have tabs, some have spaces. We should decide which one we
prefer, or adopt the Apache standard if there is one, and convert accordingly.
(The jsps are all consistent and use only tabs, which in my opinion should
remain because their mixed nature makes spaces hard to work with in some
editors, like scite.)

Need to make use of tabs and spaces consistent in code base
---

Key: CONNECTORS-12
URL: https://issues.apache.org/jira/browse/CONNECTORS-12
Project: Lucene Connector Framework
Issue Type: Task
Reporter: Karl Wright
Priority: Minor

Some java files have tabs, some have spaces. Any individual file has either
one or the other, but not both.
We should decide which one we prefer, or adopt the Apache standard if there
is one, and convert accordingly.
(The jsps are all consistent and use only tabs, which in my opinion should
remain because their mixed nature makes spaces hard to work with in some
editors, like scite.)


[jira] Resolved: (CONNECTORS-3) Ant build needs to be created for code base

2010-02-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-3.
--

Resolution: Fixed

Ant builds are complete for the java part of the project.  Still need builds 
for C part and for documentation, but will open separate tickets for those.


 Ant build needs to be created for code base
 ---

 Key: CONNECTORS-3
 URL: https://issues.apache.org/jira/browse/CONNECTORS-3
 Project: Lucene Connector Framework
  Issue Type: Task
Reporter: Karl Wright

 The code granted by MetaCarta was built within a debian system.  It would be 
 much more consistent with Apache philosophy to make a self-contained ant 
 build for the code base.  In the future, if debian packages are again 
 required, they could simply wrap the ant build.


[jira] Resolved: (CONNECTORS-2) Revamp package names and paths to remove MetaCarta references

2010-02-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-2?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-2.
--

Resolution: Fixed

Code reorganized as described.


 Revamp package names and paths to remove MetaCarta references
 -

 Key: CONNECTORS-2
 URL: https://issues.apache.org/jira/browse/CONNECTORS-2
 Project: Lucene Connector Framework
  Issue Type: Task
Reporter: Karl Wright

 The software grant from MetaCarta will not be reorganized prior to the grant, 
 so MetaCarta-specific package and class names will be present.  The code 
 needs to be appropriately rearranged to adhere to Apache package-name 
 standards.

