Re: The PPMC welcomes Erlend Garåsen as a new ManifoldCF committer!

2011-03-31 Thread Karl Wright
The congratulations are now official! The vote passed and results have been forwarded to the incubator. Karl On Tue, Mar 29, 2011 at 3:11 AM, Karl Wright wrote: > I messed up here because of the confusion between whether our previous > connectors-private discussion constituted a vote

Re: Filtering out unwanted content from HTML pages

2011-03-31 Thread Karl Wright
This is a good question. I think we should carry this conversation forward on connectors-dev. My initial thought on this issue is that the functionality really belongs in Tika. Tika is set up to extract and filter in exactly this way. The only reason you'd want to do it in MCF is if it would ch

Re: [VOTE] Release ManifoldCF-0.2-incubating

2011-03-31 Thread Karl Wright
st related, though. But I think it's worth trying to understand the issue more completely before deciding whether to respin or accept the RC as it stands. Karl On Thu, Mar 31, 2011 at 12:50 PM, Koji Sekiguchi wrote: > (11/03/21 9:32), Karl Wright wrote: >> >> The tag is >>

Re: [VOTE] Release ManifoldCF-0.2-incubating

2011-03-31 Thread Karl Wright
Looking at it in depth, it's likely that it's a Derby bug. I've filed ticket DERBY-5169 accordingly, and marked CONNECTORS-172 as a blocker for 0.2-incubating. Karl On Thu, Mar 31, 2011 at 1:11 PM, Karl Wright wrote: > Yes, this does sound like CONNECTORS-172.  It's int

[RESULT][VOTE] Release ManifoldCF-0.2-incubating

2011-04-01 Thread Karl Wright
Vote failed; a new release candidate with a fix is being respun. This will be RC1. Karl On Thu, Mar 31, 2011 at 1:57 PM, Karl Wright wrote: > Looking at it in depth, it's likely that it's a Derby bug.  I've filed > ticket DERBY-5169 accordingly, and marked CONNECTORS-172 a

[VOTE] Release ManifoldCF-0.2-incubating, RC1

2011-04-01 Thread Karl Wright
RC1 is now available on http://people.apache.org/~kwright/apache-manifoldcf-0.2-incubating. Please check it out and vote! Karl On Fri, Apr 1, 2011 at 3:38 AM, Karl Wright wrote: > Vote failed; a new release candidate with a fix is being respun.  This > will be RC1. > Karl >

Re: [VOTE] Release ManifoldCF-0.2-incubating, RC1

2011-04-02 Thread Karl Wright
+1 also. I ran ant test-pg (yes, the new PostgreSQL test target), ant test, and looked at the javadocs and site docs. Karl On Fri, Apr 1, 2011 at 11:04 PM, Koji Sekiguchi wrote: > (11/04/01 19:24), Karl Wright wrote: >> >> RC1 is now available on >> http://people.apach

JIRA tickets

2011-04-02 Thread Karl Wright
I just did a pass through the open JIRA tickets, and where reasonable provided a summary of the ticket's status, also updating the title if it has become incorrect. Hopefully this will help Erlend, and any other committers who are looking for tickets to tackle. I also created a number of smaller

Re: The MCF FAQ

2011-04-04 Thread Karl Wright
+1 Karl On Mon, Apr 4, 2011 at 9:51 AM, Erlend Garåsen wrote: > > Hello list, > > I have updated the faq and added information regarding Solr and content > extraction. > > Maybe we should create a new header for all the database-specific issues? > Now you will find a lot of information regarding

Re: [VOTE] Release ManifoldCF-0.2-incubating, RC1

2011-04-05 Thread Karl Wright
We only need one more vote! If you are a committer and have any time, please check it out. It still has to run the gauntlet of the incubator even if it passes here... Karl On Sat, Apr 2, 2011 at 6:01 AM, Karl Wright wrote: > +1 also.  I ran ant test-pg (yes, the new PostgreSQL test tar

Re: The MCF FAQ

2011-04-06 Thread Karl Wright
're using, these must be set. > "Property.xml properties" would need to be explained in. > Hope this helps. > Thank you. > > Regards, > Shinichiro Abe > > On 2011/04/04, at 22:53, Karl Wright wrote: > >> +1 >> >> Karl >> >> On Mon, A

Re: [VOTE] Release ManifoldCF-0.2-incubating, RC1

2011-04-07 Thread Karl Wright
Still looking for one additional vote... Karl On Tue, Apr 5, 2011 at 8:52 AM, Karl Wright wrote: > We only need one more vote!  If you are a committer and have any time, > please check it out.  It still has to run the gauntlet of the > incubator even if it passes here... > > Karl

[RESULT][VOTE] Release ManifoldCF-0.2-incubating, RC1

2011-04-26 Thread Karl Wright
problem: > https://issues.apache.org/jira/browse/CONNECTORS-188 > > Erlend > > On 01.04.11 12.24, Karl Wright wrote: >> >> RC1 is now available on >> http://people.apache.org/~kwright/apache-manifoldcf-0.2-incubating. >> Please check it out and vote! >>

Re: [CONF] Apache Connectors Framework > FAQ

2011-04-26 Thread Karl Wright
ectors-user@incubator.apache.org/index.html > > http://www.mail-archive.com/connectors-dev@incubator.apache.org/index.html > > http://www.mail-archive.com/general@incubator.apache.org/index.html > > In reply to a comment by Karl Wright: > The news lists are in fact kept around; you can i

[VOTE] Release Apache ManifoldCF 0.2-incubating, RC2

2011-04-26 Thread Karl Wright
The RC2 of the 0.2-incubating release is now up on http://people.apache.org/~kwright. The svn tag is at https://svn.apache.org/repos/asf/incubator/lcf/tags/release-0.2-incubating-RC2. Please vote! Karl

Re: mcf-crawler-ui.war file

2011-04-26 Thread Karl Wright
The target in question compiles the pull agent's crawler UI jsp's to .java files. The classes are then immediately compiled using javac to .class files. Karl On Tue, Apr 26, 2011 at 11:40 PM, wrote: > In the framework build.xml file there is a target called > "compile-crawler-ui", it has a sec

Re: [CONF] Apache Connectors Framework > FAQ

2011-04-26 Thread Karl Wright
I checked in your patch, and also pushed the resulting site changes to the mirror, so they should appear within 24 hours. Karl On Tue, Apr 26, 2011 at 7:48 PM, wrote: > Created ticket CONNECTORS-189 for this.  Marked as minor improvement. > > On Tue, 26 Apr 2011 12:13:50 -0400, Ka

Re: mcf-crawler-ui.war file

2011-04-27 Thread Karl Wright
> On Wed, 27 Apr 2011 01:48:15 -0400, Karl Wright wrote: >> >> The target in question compiles the pull agent's crawler UI jsp's to >> .java files.  The classes are then immediately compiled using javac to >> .class files. >> >> Karl >> >&g

Re: [VOTE] Release Apache ManifoldCF 0.2-incubating, RC2

2011-04-27 Thread Karl Wright
PM, Karl Wright wrote: > The RC2 of the 0.2-incubating release is now up on > http://people.apache.org/~kwright.  The svn tag is at > https://svn.apache.org/repos/asf/incubator/lcf/tags/release-0.2-incubating-RC2. > > Please vote! > > Karl >

Re: Sync Directory

2011-04-28 Thread Karl Wright
No sync directory specified basically means there is no cross-process synchronization. All synchronization is within process. For the Quick Start, which runs in a single process, this is fine. For a multi-process setup, you must have one. If the doc's not clear on this point we should clarify i

Re: Sync Directory

2011-04-28 Thread Karl Wright
I'll open a ticket and also make the patch, once I understand it. > > Sent from my iPhone > > On Apr 28, 2011, at 8:44 AM, Karl Wright wrote: > >> No sync directory specified basically means there is no cross-process >> synchronization.  All synchronization is within

Re: Multi-process setup

2011-04-28 Thread Karl Wright
connectors.xml only applies to the Quick Start. All connectors are registered every time you run the process. For a multiprocess setup, there is no single process this should occur on, and furthermore, it is more technically correct to register a connector only once. Some of the steps of registr

Re: Starting Agent

2011-04-28 Thread Karl Wright
It sounds OK to me. Karl On Thu, Apr 28, 2011 at 5:55 PM, wrote: > When I run the agent process, I get the following messages: > > Running... > Configuration file successfully read > > and then it is just waiting.  Am I supposed to get a message saying "Agent > Started" or something more than the

Re: Agent Process in Eclipse

2011-04-28 Thread Karl Wright
Because the sync dir is used by ALL the running processes, it is not safe to have just ONE clean up the area on startup or shutdown. My thought is that since the AgentStop process uses the synch area too, you are neglecting to supply the correct -Dorg.apache.manifoldcf.configfile switch to it so t

Re: [VOTE] Release Apache ManifoldCF 0.2-incubating, RC2

2011-04-28 Thread Karl Wright
> (11/04/27 7:12), Karl Wright wrote: >> >> The RC2 of the 0.2-incubating release is now up on >> http://people.apache.org/~kwright.  The svn tag is at >> >> https://svn.apache.org/repos/asf/incubator/lcf/tags/release-0.2-incubating-RC2. >> >> Please vote! >

[RESULT][VOTE] Release Apache ManifoldCF 0.2-incubating, RC2

2011-05-01 Thread Karl Wright
I count 4 +1's, no -1's. Vote passes! I'll request a vote in the incubator next. Karl On Fri, Apr 29, 2011 at 9:48 AM, Grant Ingersoll wrote: > +1 > > On Apr 26, 2011, at 6:12 PM, Karl Wright wrote: > >> The RC2 of the 0.2-incubating release is now up on >

Re: Connector Transaction Data

2011-05-02 Thread Karl Wright
Oh, but you might glean something from the Chapter 9 example, which is still under development but may have enough stuff in it to be interesting. http://manifoldcfinaction.googlecode.com/svn/trunk/edition_1/output_connector_example/src/org/apache/manifoldcf/agents/output/docs4u/Docs4UOutputConnecto

"Book release"

2011-05-02 Thread Karl Wright
Once the 0.2-incubating release goes out the door, I'd like to propose that the next release be considered a ManifoldCF in Action "book release". Basically this will mean that we need a release that is consistent with the examples and explanations in the book, before the book actually is done. 0.

Re: Agent Process in Eclipse

2011-05-02 Thread Karl Wright
>        at >  org.apache.manifoldcf.core.database.Database.execute(Database.java:566) >        at >  org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421) >  PooledConnection.guardConnection(): found closed Connection. Statement >  information follows. Atte

Re: document.getBinaryStream()

2011-05-02 Thread Karl Wright
Using a temporary file is the right approach. Karl On Mon, May 2, 2011 at 12:30 PM, wrote: > So I just learnt that you can not reuse a stream.  In my code I create a > hash from the file content, and then I want to send it via http to the > archive server.  You guessed it the InputStream is no
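The temporary-file approach can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name `TempFileSpool`, the method names, and the choice of SHA-256 are all assumptions, and it uses `java.nio.file`, which requires Java 7 or later. The one-shot stream is drained to disk once, and each later consumer (the hash, then the upload) opens its own fresh stream over the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; none of these names come from ManifoldCF.
public class TempFileSpool {

    /** Drains a one-shot stream into a temporary file so the bytes can be re-read. */
    public static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("doc-", ".bin");
        // createTempFile has already created the file, so overwrite it.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    /** First pass: hash the spooled content (SHA-256 is an arbitrary choice here). */
    public static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        InputStream oneShot = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        Path tmp = spool(oneShot);
        try {
            String hash = sha256Hex(tmp);
            // Second pass: a fresh stream over the same bytes, e.g. for the HTTP upload.
            try (InputStream again = Files.newInputStream(tmp)) {
                System.out.println(hash + " " + Files.size(tmp) + " bytes ready to send");
            }
        } finally {
            // Always remove the temporary file, even if hashing or sending fails.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Deleting the file in a `finally` block matters in a long-running crawler process, where leaked temporary files would otherwise accumulate.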

Re: document.getBinaryStream()

2011-05-02 Thread Karl Wright
e right return code.  Right now I'm choosing between > DOCUMENTSTATUS_ACCEPTED or DOCUMENTSTATUS_REJECTED, what is the right way of > communicating an internal error in the connector? > > On Mon, 2 May 2011 12:50:07 -0400, Karl Wright wrote: >> >> Using a temporary file i

Re: Agent Process in Eclipse

2011-05-02 Thread Karl Wright
urce called > Eclipse and store the relevant parts in there that would go along with the > documented setup steps. > > On Mon, 2 May 2011 09:00:32 -0400, Karl Wright wrote: >> >> If you have an eclipse settings file or documentation, please consider >> contribut

Re: Interface Calls

2011-05-02 Thread Karl Wright
The book explains all of this in great detail, as well as logging conventions. This all begins in Chapter 5, which is the next chapter to be released via MEAP. Logging for an output connector should use the org.apache.manifoldcf.agents.system.Logging.ingest logger. You enable that logger with th

Re: Connector Transaction Data

2011-05-02 Thread Karl Wright
println(Thread.currentThread().getStackTrace()[1].getMethodName() > + " new ID assigned [" + currentContext.get("id") + "]"); >                        } else { > >  //System.out.println(Thread.currentThread().getStackTrace()[1].getMethodName() > + " already has an id set to ["

Re: "Book release"

2011-05-04 Thread Karl Wright
m > not sure what the ASF policy is on that stuff) > > -Grant > > On May 2, 2011, at 8:59 AM, Karl Wright wrote: > >> Once the 0.2-incubating release goes out the door, I'd like to propose >> that the next release be considered a ManifoldCF in Action "book >> re

A cloud-based testing environment?

2011-05-10 Thread Karl Wright
I've pretty much given up on getting access to the testing infrastructure I originally built for ManifoldCF that is now owned by qBase. I was wondering if anyone had time or energy to research what it would take to build such an infrastructure using Amazon cloud servers. Basically, we'd need at l

Re: A cloud-based testing environment?

2011-05-10 Thread Karl Wright
hat about Windows Azure? > It seems that it can deploy MS products and use java, > I do not know the details though. > Isn't it relevant? > > Regards, > Shinichiro Abe > > On 2011/05/10, at 20:08, Karl Wright wrote: > >> I've pretty much given up on get

Re: A cloud-based testing environment?

2011-05-11 Thread Karl Wright
ive-directory-%E3%83%89%E3%83%A1%E3%82%A4%E3%83%B3%E3%81%AB%E5%8F%82%E5%8A%A0/ > > Shinichiro Abe > > On 2011/05/11, at 9:22, Karl Wright wrote: > >> I seem to recall that Azure is available for free through MSDN, which >> means it is likely to be affordable.  Can you f

The ManifoldCF PPMC welcomes Shinichiro Abe as a ManifoldCF committer!

2011-05-13 Thread Karl Wright
Please join us in welcoming Shinichiro to the ManifoldCF team! Karl

Re: "Book release"

2011-05-13 Thread Karl Wright
incubator list. So please consider doing so! Thanks, Karl On Wed, May 4, 2011 at 11:29 AM, Karl Wright wrote: > I have no problem with a ManifoldCF 0.3-incubating release, and I > agree that technically a book release has nothing to do with a > software release.  But community buil

Tommaso Teofili has joined ManifoldCF as a new mentor!

2011-05-15 Thread Karl Wright
Please join me in thanking Tommaso for taking on this responsibility! Karl

[ANNOUNCE] ManifoldCF 0.2-incubating released!

2011-05-17 Thread Karl Wright
The mirrors should update soon, and the site also within 24 hours. Enjoy, and thanks to all who helped on this! Karl

The burning technical issues of the day

2011-05-17 Thread Karl Wright
For those who have just entered the ManifoldCF project, I'd like to first extend my congratulations once again! You are probably still trying to figure out exactly what's going on and where we are going. Unfortunately, this being Apache, I cannot actually answer your question, because you are par

Re: The burning technical issues of the day

2011-05-17 Thread Karl Wright
ign time" opinion is that we should try to mock those > systems. > Did you try to mock any of them yet? Does this sound good/bad to you? > Regards, > Tommaso > > > 2011/5/17 Karl Wright > >> For those who have just entered the ManifoldCF project, I'd like to

Re: The burning technical issues of the day

2011-05-19 Thread Karl Wright
>>> infrastructure would possibly expose ManifoldCF to this issue again in >>> case >>> of maintenance, versions evolution and so on therefore, although I still >>> have to investigate if that can be sorted technically for the supported >>> systems, my "

Re: poll method

2011-05-23 Thread Karl Wright
The preferred way to set up connections is to have every method that requires a set-up connection call a getSession() method. This is in fact pretty much enforced by the fact that connect() cannot throw a ManifoldCFException. Chapter 6 of ManifoldCF in Action describes the preferred form via a
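A minimal sketch of that getSession() pattern is shown below. It deliberately does not extend any real ManifoldCF class: `LazySessionSketch`, `ConnectorException` (a stand-in for a checked exception such as ManifoldCFException), `Session`, and the "server" parameter are all hypothetical names used for illustration. The point is only the shape: connect() records configuration and cannot fail, while the first method that actually needs the connection establishes it and is free to throw.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical connector skeleton; not the real ManifoldCF connector API.
public class LazySessionSketch {

    /** Stand-in for a checked connector exception. */
    public static class ConnectorException extends Exception {
        public ConnectorException(String message) {
            super(message);
        }
    }

    /** Stand-in for whatever handle the remote system hands back. */
    public static class Session {
        final String server;

        Session(String server) {
            this.server = server;
        }
    }

    private Map<String, String> params = new HashMap<String, String>();
    private Session session = null;
    int sessionsOpened = 0; // exposed only so the sketch can be checked

    /** Cannot throw, so it only records the configuration. */
    public void connect(Map<String, String> configParams) {
        params = new HashMap<String, String>(configParams);
    }

    /** Establishes the session on first use; this is where failures surface. */
    protected Session getSession() throws ConnectorException {
        if (session == null) {
            String server = params.get("server");
            if (server == null || server.length() == 0) {
                throw new ConnectorException("Missing 'server' parameter");
            }
            session = new Session(server);
            sessionsOpened++;
        }
        return session;
    }

    /** Every method that needs the connection starts by calling getSession(). */
    public String check() throws ConnectorException {
        Session s = getSession();
        return "Connection working: " + s.server;
    }

    /** Drops the session and the recorded configuration. */
    public void disconnect() {
        session = null;
        params = new HashMap<String, String>();
    }
}
```

With this shape, a bad configuration is reported by the first real operation rather than silently swallowed in connect().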

Re: poll method

2011-05-23 Thread Karl Wright
> connector since we have a multi threaded system.  I was using my made up id > to sort out say all the log lines for the connector that has thread 3 set as > its context. > > On Mon, 23 May 2011 15:45:26 -0400, Karl Wright wrote: >> >> The preferred way to set up co

Re: Turning on Logging

2011-05-24 Thread Karl Wright
Karl On Tue, May 24, 2011 at 3:08 PM, wrote: > What is the properties.xml line to turn on logging say for connectors and > db.  Is it the following?  Do you say value = true or enabled or on? > > > > > Thanks, > Farzad. >

Re: Turning on Logging

2011-05-24 Thread Karl Wright
nder.MAIN.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n > > log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender > log4j.appender.STDOUT.layout=org.apache.log4j.PatternLayout > log4j.appender.STDOUT.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n > > > On Tue, 24 M

Re: Turning on Logging

2011-05-24 Thread Karl Wright
%5p [%t] (%F:%L) - %m%n > > > On Tue, 24 May 2011 15:20:13 -0400, Karl Wright wrote: >> >> The rootLogger in the .ini file may be set to DEBUG, but the >> properties.xml loggers explicitly control each subsystem, so those win >> (for subsystems). >> >&g

Re: Turning on Logging

2011-05-24 Thread Karl Wright
ing.connectors.debug("DupFinderConnector Constructor > Called"); >                System.out.println("DUP CONS CALLED"); > > Thanks! > > On Tue, 24 May 2011 15:41:07 -0400, Karl Wright wrote: >> >> The sample app logging.ini file was already updated;

Re: Turning on Logging

2011-05-25 Thread Karl Wright
r first > or the agent process?  The reason I ask, I keep ending up with sub folders > in the sync dir, should a clean termination result in an empty sync dir? > > On Tue, 24 May 2011 19:31:38 -0400, Karl Wright wrote: >> >> Check to be sure your properties.xml file didn't

Re: Fatal Error

2011-05-25 Thread Karl Wright
My guess would be inadvertent cross-thread object sharing again. Nothing significant has changed in ManifoldCF in this area in a long while. Karl On Wed, May 25, 2011 at 6:10 PM, wrote: > I'm getting some very strange internal errors.  I'd like to say I > haven't done something, but some

Re: Fatal Error

2011-05-26 Thread Karl Wright
I've had a moment to look at this in more detail. The line where this failed indicates that the getConfiguration() method for your connector is returning null. Hope this helps. Karl On Wed, May 25, 2011 at 6:14 PM, Karl Wright wrote: > My guess would be inadvertent cross-threa

Re: Fatal Error

2011-05-26 Thread Karl Wright
r.setThreadContext(threadContext); >                } catch (ManifoldCFException e) { >                        e.printStackTrace(); >                } >        } > > > On Wed, 25 May 2011 18:14:29 -0400, Karl Wright wrote: >> >> My guess would be inadvertent cross-thread o

Criteria for graduation?

2011-05-26 Thread Karl Wright
Does anyone have a firm sense of what additional milestones will be necessary for ManifoldCF to meet to graduate from the incubator? Karl

Re: Fatal Error

2011-05-26 Thread Karl Wright
ng parms I'm not? > > On 5/26/2011 7:52 AM, Karl Wright wrote: >> >> Is it possible for your connector to return a null value from a >> getConfiguration() method call?  This would be unlikely if it extended >> BaseOutputConnector, but maybe it does not. >> &

Re: Fatal Error

2011-05-26 Thread Karl Wright
o I need to call the super method for disconnect, install, deinstall? > Any others?  I noticed adding it to disconnect I had to modify the method > signature to throw the ManifoldCFException. > > On 5/26/2011 9:19 AM, Karl Wright wrote: >> >> Here's your problem: >>

Re: Fatal Error

2011-05-26 Thread Karl Wright
Valad wrote: > So even for addOrReplaceDocument, removeDocument, and getActivitiesList I > should call the super? > > On 5/26/2011 10:06 AM, Karl Wright wrote: >> >> In general, if you are extending BaseOutputConnector, you should >> either not implement a method at

Re: Logging Hierarchy

2011-05-26 Thread Karl Wright
(1) Logging. The way the logging works is that there are ManifoldCF system loggers, and everything else. System loggers are the ones that you talk about in the properties.xml file. The logging level (and ONLY the logging level) is overridden for the system loggers by means of the properties.xml

Re: Logging Hierarchy

2011-05-26 Thread Karl Wright
messages to be logged. Karl On Thu, May 26, 2011 at 7:58 PM, Karl Wright wrote: > (1) Logging.  The way the logging works is that there are ManifoldCF > system loggers, and everything else.  System loggers are the ones that > you talk about in the properties.xml file.  The logging level (a

Re: Criteria for graduation?

2011-05-30 Thread Karl Wright
though >> maybe still not enough to exit incubator? Not sure. >> >> Is http://incubator.apache.org/connectors/who.html up to date?  No, looks >> like >> it's not... >> >> Otis >> >> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutc

[VOTE] Adopt Java 1.5 as the minimum Java release for ManifoldCF

2011-05-30 Thread Karl Wright
Please have a look at CONNECTORS-203 and vote +1 if you think it's time to move beyond Java 1.4 at the source level, to Java 1.5. I did some work on this over the weekend and managed to convince myself that a migration to a newer Java version will have no obvious ill effects. But I'd like your tho

My ManifoldCF talk has been accepted for ApacheCon North America 2011 in Vancouver

2011-05-31 Thread Karl Wright
I'll be giving a 45-minute introductory talk in Vancouver at ApacheCon North America, some time between November 9 and November 11, 2011. If anyone has any particular detail or issue they would like to see in the talk, I'd be happy to entertain your suggestion. Please let me know. Karl

Fwd: Incubator PMC/Board report for June 2011 (connectors-dev@incubator.apache.org)

2011-06-01 Thread Karl Wright
The March report looked like this: ManifoldCF --Description-- ManifoldCF is an incremental crawler framework and set of connectors designed to pull documents from various kinds of repositories into search engine indexes or other targets. The current bevy of connectors includes Documentum (EMC),

Re: Incubator PMC/Board report for June 2011 (connectors-dev@incubator.apache.org)

2011-06-01 Thread Karl Wright
d off by mentor: Karl On Wed, Jun 1, 2011 at 10:23 AM, Tommaso Teofili wrote: > I think the successful release should be mentioned too :-) > Tommaso > > 2011/6/1 Karl Wright > >> The March report looked like this: >> >> ManifoldCF >> >> --Description--

Re: DB Error

2011-06-01 Thread Karl Wright
The transaction error is the result of the id generation error. That error is the result of a corrupted file in the sync area. I suggest you use LockClean to clean your locks, after shutting down all processes. Then, restart everything. If you get this error again it is because of cross-thread

Re: DB Error

2011-06-01 Thread Karl Wright
ethods to pass the TCs along.  I never store > the TCs or pass along any data through them.  How do I verify that I'm doing > or not doing any cross-thread use of TCs? > > Thanks, > Farzad. > > On 6/1/2011 1:35 PM, Karl Wright wrote: >> >> The transaction e

Re: handles, threads, and workers

2011-06-01 Thread Karl Wright
There is a formula on the how-to-build-and-deploy.html page. You can lower the number of threads without issue but you must have enough database handles so you don't starve the threads of handles. Each thread can use a handle at a time. Karl On Wed, Jun 1, 2011 at 3:07 PM, Farzad Valad wrote:

Re: Incubator PMC/Board report for June 2011 (connectors-dev@incubator.apache.org)

2011-06-02 Thread Karl Wright
I've now edited the page accordingly. Let me know of any changes you'd like to see. Karl On Thu, Jun 2, 2011 at 4:18 AM, Tommaso Teofili wrote: > it sounds good to me, any others? > Tommaso > > 2011/6/1 Karl Wright > >> Here's my proposed text:

ManifoldCF now officially requires Java 1.5

2011-06-02 Thread Karl Wright
Hi everyone, I've checked in changes that move ManifoldCF from mostly the Java 1.4 world into the Java 1.5 world. This should introduce no compilation errors in user connector code, but most people will need to do a clean recompile to get a working system again. Please let me know ASAP if anyone

[RESULT][VOTE] Adopt Java 1.5 as the minimum Java release for ManifoldCF

2011-06-02 Thread Karl Wright
Although it hasn't been quite the required 3 days, this vote isn't binding anyway, so I'm going to declare it closed and commit the code. Karl On Mon, May 30, 2011 at 7:32 PM, Karl Wright wrote: > Please have a look at CONNECTORS-203 and vote +1 if you think it's > ti

Re: CrawlerCommons & ManifoldCF

2011-06-02 Thread Karl Wright
Absolutely! We're a bit thin on active committers at the moment, which will probably limit our ability to take any highly active roles in your development process. But we do have a pile of code which you might be able to leverage, and once there is common functionality available I think we'd all p

Re: CrawlerCommons & ManifoldCF

2011-06-02 Thread Karl Wright
ld be > shared and would not take too much effort to be made generic. I haven't > looked to the code of the crawler in great details but do you think the > robots parser would be a good candidate? > > Julien > > On 2 June 2011 16:23, Karl Wright wrote: > >> Absolut

Re: Exception Handling

2011-06-03 Thread Karl Wright
Your choice of exception would have been fine if this was a repository connector, but output connectors do not have the same ability to abort jobs via ManifoldCFExceptions at this time. (You can create a ticket if you think this is how it should work). But if you want the job to abort, you probab

Re: Exception Handling

2011-06-03 Thread Karl Wright
for this ManifoldCFException type I'm having a hard time recollecting; but I seem to recall vaguely it had something to do with the LiveLink connector. I'll post later if it comes back to me. Karl On Fri, Jun 3, 2011 at 1:11 PM, Karl Wright wrote: > Your choice of exception would have

Re: Exception Handling

2011-06-03 Thread Karl Wright
e livelink server, and in the case of CIFS, by fixing a too-short timeout in jcifs. So, in theory, this retry logic could be removed. I'll create a ticket to research this further. Karl On Fri, Jun 3, 2011 at 1:29 PM, Karl Wright wrote: > Actually, looking at the code, the REPOSITO

Re: Exception Handling

2011-06-03 Thread Karl Wright
CONNECTORS-207 describes the situation. Karl On Fri, Jun 3, 2011 at 1:41 PM, Karl Wright wrote: > I remember now. > The problem was that the LiveLink API code, under certain conditions, > "lied" about the error it got back from the server.  Under these > conditions,

Re: Exception Handling

2011-06-03 Thread Karl Wright
rom your reply I > got the impression that the user will get the choice to skip or abort, then > what do you set these parms to? 0?  Thanks! > > On 6/3/2011 12:11 PM, Karl Wright wrote: >> >> Your choice of exception would have been fine if this was a repository >> connec

Re: Strange Exception

2011-06-05 Thread Karl Wright
I would guess that dataManager is null. The only other possibility is that document is null, and I don't think that can happen. Karl On Fri, Jun 3, 2011 at 4:11 PM, Farzad Valad wrote: > So I've been trying to figure this out for days now and still not even > close.  So I'm getting this in th

Travel assistance, ApacheCon NA 2011

2011-06-06 Thread Karl Wright
The Apache Software Foundation (ASF)'s Travel Assistance Committee (TAC) is now accepting applications for ApacheCon North America 2011, 7-11 November in Vancouver BC, Canada. The TAC is seeking individuals from the Apache community at-large --users, developers, educators, students, Committers, an

Re: Data Manager Null

2011-06-07 Thread Karl Wright
It sounds like you are on the right track for fixing all of these problems. Karl On Tue, Jun 7, 2011 at 4:38 PM, Farzad Valad wrote: > I think I found the problem.  I should be tearing down the dataManager and > recreating it between clear and set thread context calls, because it has a > thread

Re: Aborting State

2011-06-07 Thread Karl Wright
The cross-thread issues you were having with your connector would certainly have affected database access in a significant way, so this symptom could well be one result of that problem. Karl On Tue, Jun 7, 2011 at 10:20 AM, Farzad Valad wrote: > Lately when I issue an abort on a crawl job (click

Re: Data Manager Null

2011-06-07 Thread Karl Wright
ReplaceDocument.  This was something you recommended when I was > asking about the third party repository. > > Farzad. > > On 6/7/2011 4:35 PM, Karl Wright wrote: >> >> It sounds like you are on the right track for fixing all of these >> problems. >> >> Ka

Re: Data Manager Null

2011-06-07 Thread Karl Wright
ddOrReplaceDocument.  The other caveat is that >> I'll make dataManager a class variable, instead of static.  So each object >> would have its own instance with its TC, and in clearTC they'd be nulling >> their version and not anyone else's. >> >> Do I get

Re: RegisterOutput Error

2011-06-08 Thread Karl Wright
The code is: Throwable z = e.getTargetException(); if (z instanceof Error) throw (Error)z; else throw (ManifoldCFException)z; The problem cannot be that z is null, because "z instanceof Error" does not blow up. Indeed: "java.lang.NullPointerException cannot be

Re: RegisterOutput Error

2011-06-08 Thread Karl Wright
Ok, I have checked in a fix for the RuntimeException handling. If you try the new code, you should get a full trace for the NPE that is causing the problem. Karl On Wed, Jun 8, 2011 at 3:20 PM, Karl Wright wrote: > The code is: > >      Throwable z = e.getTargetException(); >
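For readers following along, here is one way the unwrapping of a reflective call could handle unchecked exceptions. This is a sketch of the general technique, not the fix that was checked in: `UnwrapSketch`, `FrameworkException` (a stand-in for ManifoldCFException), and `boom()` are hypothetical names. In the snippet quoted in this thread, a NullPointerException thrown by the target falls through to the final cast and is replaced by a ClassCastException, which hides the real stack trace; rethrowing RuntimeException before the cast avoids that.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical illustration; not the actual ManifoldCF code.
public class UnwrapSketch {

    /** Stand-in for the framework's checked exception type. */
    public static class FrameworkException extends Exception {
        public FrameworkException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Invokes a method reflectively and rethrows the target's exception unchanged. */
    public static Object invoke(Method m, Object target, Object... args) throws FrameworkException {
        try {
            return m.invoke(target, args);
        } catch (InvocationTargetException e) {
            Throwable z = e.getTargetException();
            if (z instanceof Error) {
                throw (Error) z;
            }
            if (z instanceof RuntimeException) {
                // Without this branch an NPE would hit the cast below and be masked.
                throw (RuntimeException) z;
            }
            if (z instanceof FrameworkException) {
                throw (FrameworkException) z;
            }
            throw new FrameworkException("Unexpected checked exception", z);
        } catch (IllegalAccessException e) {
            throw new FrameworkException("Cannot access method", e);
        }
    }

    /** Deliberately fails with a NullPointerException. */
    public static String boom() {
        String s = null;
        return s.trim();
    }

    /** Returns the class name of whatever exception escapes invoke(). */
    public static String demo() {
        try {
            Method m = UnwrapSketch.class.getMethod("boom");
            invoke(m, null);
            return "no exception";
        } catch (RuntimeException e) {
            return e.getClass().getName();
        } catch (Exception e) {
            return "checked: " + e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the original exception object is rethrown rather than wrapped, its stack trace still points at the line inside the target method that failed.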

Re: RegisterOutput Error

2011-06-08 Thread Karl Wright
er saw this error till setting up a > new system.  I guess I can't log inside the constructor? > > On 6/8/2011 2:34 PM, Karl Wright wrote: >> >> Ok, I have checked in a fix for the RuntimeException handling.  If you >> try the new code, you should get a full trace for

Re: OpenSearchServer output connector

2011-06-09 Thread Karl Wright
Thanks very much! For developing an output connector, I would highly recommend getting hold of ManifoldCF in Action. Chapter 9 of that book describes how to construct an output connector, and Chapter 6 describes the rules for connectors in general. You can buy into the Early Access Program here:

Re: CMIS Connector

2011-06-13 Thread Karl Wright
Yes, a CMIS connector would be very welcome, especially if you yourself have reason to use it. If you want to contribute it, please follow the directions at: https://cwiki.apache.org/confluence/display/CONNECTORS/HowToContribute It's especially important to contribute your connector as a patch vi

Re: MySql DBInterface problem on getTableSchema

2011-06-20 Thread Karl Wright
Rather than change the database contract, which would have far-reaching effects, is there any way to simply implement getTableSchema to work properly with the abstraction? For example, read the result of the DESCRIBE within the getTableSchema method and translate it in whatever manner is needed.

Re: Excluding html files and following links

2011-06-20 Thread Karl Wright
Hi Erlend, The inclusions and exclusions are based solely on URL, and block the connector from fetching the file. Otherwise you would easily wind up fetching the entire web. However, this raises an interesting issue as to whether there's a way in the web connector to do what you are trying to do

Re: Excluding html files and following links

2011-06-21 Thread Karl Wright
he dev list in order to get some feedback on this > issue. > > Erlend > > On 20.06.11 18.00, Karl Wright wrote: >> >> Hi Erlend, >> >> The inclusions and exclusions are based solely on URL, and block the >> connector from fetching the file.  Otherwise y

Re: Excluding html files and following links

2011-06-23 Thread Karl Wright
Have there been any further developments on this thread? Karl On Tue, Jun 21, 2011 at 6:08 AM, Karl Wright wrote: > Sure.  But you've already convinced me we need a new feature. ;-) > > Karl > > On Tue, Jun 21, 2011 at 3:50 AM, Erlend Garåsen > wrote: >> >>

Re: Excluding html files and following links

2011-06-23 Thread Karl Wright
nts which exceed a preset size. We have discovered PDFs of 500 > MB. What do you think? Do we need such a feature as well? > > Erlend > > On 23.06.11 12.08, Karl Wright wrote: >> >> Have there been any further developments on this thread? >> Karl >> >> On Tue,

Re: Sync Dir

2011-07-05 Thread Karl Wright
Hi Farzad - any luck on getting that stack trace? Karl On Sat, Jul 2, 2011 at 1:09 PM, daddy...@gmail.com wrote: > The unique key violation is not expected - if you could send along a complete > stack trace that would be good. > > The lock clean procedure is to shut down all mcf processes, execu

Re: Excluding html files and following links

2011-07-05 Thread Karl Wright
Have you had a look at the feature added, and does it work for you? I'd also still be interested in knowing where you are seeing out-of-memory situations. Karl On Thu, Jun 23, 2011 at 8:03 AM, Karl Wright wrote: > Hi Erlend, > > I hope you are not seeing memory issues on la

Re: Sync Dir

2011-07-05 Thread Karl Wright
gresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:305) >    at > org.apache.manifoldcf.core.database.Database.execute(Database.java:606) >    at > org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421) > > On 7/5/2011

Re: Sync Dir

2011-07-05 Thread Karl Wright
Also, if you need a unique ID, I suggest that you call ManifoldCF's unique ID generator. Karl On Tue, Jul 5, 2011 at 10:29 AM, Karl Wright wrote: > It does seem to be in your code. > > Try psql.  The \d command should list indexes. > > Karl > > On Tue, Jul 5, 2011

Re: Sync Dir

2011-07-05 Thread Karl Wright
ails. >> >> My solution, was to make a loop, initially infinite, but then I decided to >> put in a counter of 50 tries.  The chances that 50 threads at the same time >> try to insert same hashsum should be less than me winning the lottery : ) >> >> On 7/5/2011 1
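The bounded retry described in the quoted message could look roughly like this. It is a sketch under stated assumptions: a `Set` stands in for the database table with its unique constraint (where `add` returning false plays the role of a unique key violation), and `BoundedRetrySketch` and `insertWithRetry` are made-up names. Using ManifoldCF's own unique ID generator, as suggested earlier in this thread, would avoid the need for a retry loop altogether.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical illustration; a Set simulates a table with a unique key.
public class BoundedRetrySketch {

    /**
     * Tries successive candidate keys until one is accepted or maxTries is reached.
     * Returns the key that was inserted.
     */
    public static String insertWithRetry(Set<String> table, Iterator<String> candidates, int maxTries) {
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            String key = candidates.next();
            // add() returns false when the key already exists (the "violation").
            if (table.add(key)) {
                return key;
            }
        }
        throw new IllegalStateException("No unique key after " + maxTries + " tries");
    }

    public static void main(String[] args) {
        Set<String> table = new HashSet<String>(Arrays.asList("a"));
        Iterator<String> candidates = Arrays.asList("a", "a", "b").iterator();
        // The first two candidates collide with the existing key; the third succeeds.
        System.out.println(insertWithRetry(table, candidates, 50));
    }
}
```

Capping the attempts turns a potential infinite loop into a visible error if the key source is broken.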
