[
https://issues.apache.org/jira/browse/CONNECTORS-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691635#comment-13691635
]
Karl Wright commented on CONNECTORS-727:
----------------------------------------
I took a more detailed look at the code. Here are some suggestions.
(1) This needs to be integrated into the ant build system. It needs a
build.xml file in connectors/generic, which must be called from the root
build.xml.
(2) It looks like you can remove the src directory directly under generic.
(3) The following UI code:
{code}
+ "<tr>"
+ "<th></th>"
+ "<th>" + Messages.getBodyString(locale, "generic.ParameterName") + "</th>"
+ "<th>" + Messages.getBodyString(locale, "generic.ParameterValue") + "</th>"
+ "</tr>");
{code}
... is not using the standard ManifoldCF stylesheet. It is important that the
UI stick with one of the established paradigms; otherwise you may get
unexpected results in different browsers. There is an established table
paradigm; see the web connector for examples of how to use it.
(4) Whenever a worker thread might wait on a socket, it is wise to use a
background thread instead; otherwise the ManifoldCF agents process becomes
difficult to shut down cleanly (and winds up needing to be killed).
For example, getDocumentVersions() calls the HTTP client's execute() right in
the worker thread. You did this right for the seeding interaction; now you need
it in all other such places too.
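As a rough illustration of the pattern in (4), here is a minimal sketch that is not taken from the connector code. The class BackgroundCall and its methods are hypothetical names; the idea is only that the blocking call runs on a daemon thread while the worker thread waits in join(), which stays interruptible:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper: runs one blocking call on a daemon thread. */
class BackgroundCall<T> extends Thread {
    private final Callable<T> task;
    private volatile T result;
    private volatile Throwable failure;

    BackgroundCall(Callable<T> task) {
        this.task = task;
        setDaemon(true); // a stuck socket read must not keep the JVM alive
    }

    @Override
    public void run() {
        try {
            result = task.call(); // e.g. the HTTP client execute() call
        } catch (Throwable t) {
            failure = t; // remember the failure for the waiting thread
        }
    }

    /** Waits for the task and rethrows whatever it threw. */
    T finishUp() throws InterruptedException, IOException {
        join(); // interruptible, even while the task blocks on a socket
        if (failure instanceof IOException) throw (IOException) failure;
        if (failure instanceof RuntimeException) throw (RuntimeException) failure;
        if (failure instanceof Error) throw (Error) failure;
        if (failure != null) throw new IOException(failure);
        return result;
    }

    /** Runs the task in the background and waits for its result. */
    static <T> T runInBackground(Callable<T> task)
            throws InterruptedException, IOException {
        BackgroundCall<T> t = new BackgroundCall<>(task);
        t.start();
        try {
            return t.finishUp();
        } catch (InterruptedException e) {
            t.interrupt(); // ask the background thread to give up
            throw e;
        }
    }
}
```

In a connector, the InterruptedException caught by the worker thread would presumably be turned into a ManifoldCFException with the INTERRUPTED type; that mapping is left out here to keep the sketch free of dependencies.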
(5) This code is clumsy; you could do this trivially in memory rather than
going through a temp file:
{code}
File temp = File.createTempFile("manifold", ".tmp");
temp.deleteOnExit();
try {
  FileUtils.writeStringToFile(temp, item.content);
  FileInputStream is = new FileInputStream(temp);
  doc.setBinary(is, temp.length());
  activities.ingestDocument(documentIdentifiers[i], versions[i], item.url, doc);
  is.close();
} finally {
  temp.delete();
}
{code}
The temp file is also written using the default encoding, which I think is not
what you want (you probably want UTF-8). Also, if the content is a string, the
mime type should be text/plain; it is not clear that this is what is
happening.
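A minimal sketch of the in-memory alternative suggested in (5), assuming the content is available as a String. The class InMemoryContent is a hypothetical name used only for illustration; it encodes the string explicitly as UTF-8 and exposes the stream and byte length that a call like doc.setBinary(is, length) needs:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical holder for string content encoded once as UTF-8. */
class InMemoryContent {
    private final byte[] bytes;

    InMemoryContent(String content) {
        // Explicit charset, so the result does not depend on the platform default.
        this.bytes = content.getBytes(StandardCharsets.UTF_8);
    }

    /** Length in bytes (not characters), which is what a binary length needs. */
    long length() {
        return bytes.length;
    }

    /** A fresh stream over the encoded bytes; no file is involved. */
    InputStream open() {
        return new ByteArrayInputStream(bytes);
    }
}
```

With the quoted code, this would presumably be used as doc.setBinary(body.open(), body.length()) with body = new InMemoryContent(item.content), which removes the temp file and the cleanup in the finally block.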
(6) When you catch IOException, there are a lot of cases you need to handle.
For example, InterruptedIOException should generate a
ManifoldCFException.INTERRUPTED exception, while most other IOExceptions should
generate ServiceInterruptions. You also need to catch exceptions derived from
InterruptedIOException which do NOT happen due to thread interrupt, such as
SocketTimeoutException and ConnectTimeoutException, and treat those like
IOException rather than InterruptedIOException. I recommend you look at the
DropBox connector; it has a handleIOException method that may be useful.
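The ordering of the checks in (6) can be sketched as follows. This is not the DropBox connector's handleIOException; the class IoFailures and the Outcome enum are hypothetical stand-ins for throwing a ManifoldCFException with the INTERRUPTED type or a ServiceInterruption. ConnectTimeoutException comes from the HTTP client library and is omitted to keep the sketch dependency-free; it would be tested alongside SocketTimeoutException:

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketTimeoutException;

/** Hypothetical classifier showing the order in which to test IOExceptions. */
class IoFailures {
    /** Stand-ins for the two kinds of exception the connector would throw. */
    enum Outcome { INTERRUPTED, SERVICE_INTERRUPTION }

    static Outcome classify(IOException e) {
        // Timeouts derive from InterruptedIOException but are not thread
        // interrupts, so they must be tested before the parent class.
        if (e instanceof SocketTimeoutException) {
            return Outcome.SERVICE_INTERRUPTION;
        }
        // A genuine interrupt: the thread is being asked to stop.
        if (e instanceof InterruptedIOException) {
            return Outcome.INTERRUPTED;
        }
        // Most other I/O failures are treated as transient and retried later.
        return Outcome.SERVICE_INTERRUPTION;
    }
}
```

The point of the sketch is the order of the instanceof tests: if InterruptedIOException were checked first, a plain socket timeout would be misreported as a thread interrupt.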
I'm going to stop there for the moment; please let me know if you disagree with
any of these suggestions.
> generic connector
> -----------------
>
> Key: CONNECTORS-727
> URL: https://issues.apache.org/jira/browse/CONNECTORS-727
> Project: ManifoldCF
> Issue Type: Improvement
> Reporter: Maciej Lizewski
> Assignee: Karl Wright
>
> OK, this is tricky, but a really nice idea. I was thinking about indexing
> sources that have no API, whose API does not provide the information needed
> by Manifold, or that have a dedicated system and IT team that could easily
> add an API.
> Today you have to write a dedicated connector, and probably an API extension,
> plugin, etc., that talk to each other to provide seeds, versions, and
> documents. That requires knowing how to write Manifold connectors AND
> knowing the target system, and there are not many programmers who know both
> systems :)
> So let's make things easier: provide a "generic" Manifold connector that
> works with a "generic" API (i.e. XML over HTTP, which is *really* easy to
> implement in any language). This API and protocol are strictly defined and
> specified. Then, to integrate with a custom document repository, one only
> has to implement an API entry point that follows those specifications.