Well, I have extended BaseOutputConnector and am not overriding
getConfiguration(). I don't know why this would be a problem now, but
should I be overriding a method or supplying parameters that I'm not?
On 5/26/2011 7:52 AM, Karl Wright wrote:
Is it possible for your connector to return a null value from a
getConfiguration() method call? This would be unlikely if it extended
BaseOutputConnector, but maybe it does not.
Karl
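One way getConfiguration() could end up returning null even when the connector extends the base class is sketched below. This is only an illustration with made-up classes (SketchBaseConnector, ForgetfulConnector, DelegatingConnector, and a plain Map standing in for ConfigParams). It assumes the base class remembers the configuration inside connect(), which should be checked against the actual BaseOutputConnector source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not ManifoldCF classes.
class NullConfigSketch {

  // Assumed behaviour: the base class stores the configuration in connect().
  static class SketchBaseConnector {
    protected Map<String, String> params = null;

    public void connect(Map<String, String> configParams) {
      params = configParams;
    }

    public Map<String, String> getConfiguration() {
      return params;
    }
  }

  // Overrides connect() but never calls super.connect().
  static class ForgetfulConnector extends SketchBaseConnector {
    @Override
    public void connect(Map<String, String> configParams) {
      // Logging only; the configuration is never recorded.
    }
  }

  // Overrides connect() and delegates to the base class.
  static class DelegatingConnector extends SketchBaseConnector {
    @Override
    public void connect(Map<String, String> configParams) {
      super.connect(configParams);
    }
  }

  // Connects with an empty configuration and reports whether it was lost.
  static boolean configIsNull(SketchBaseConnector connector) {
    connector.connect(new HashMap<String, String>());
    return connector.getConfiguration() == null;
  }

  public static void main(String[] args) {
    System.out.println("forgetful null? " + configIsNull(new ForgetfulConnector()));
    System.out.println("delegating null? " + configIsNull(new DelegatingConnector()));
  }
}
```

If the real base class behaves like the sketch, a connect() override that only logs would be enough to produce a null configuration without getConfiguration() itself being overridden.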
On Thu, May 26, 2011 at 8:43 AM,<[email protected]> wrote:
So I put log statements in all my methods; the last one called is
setThreadContext. Also, I'm not sharing objects between threads; I removed
the id code altogether. Maybe something is corrupt in the db tables? I'm just
trying to edit an existing job. I could try zapping the db table and starting
over.
public void setThreadContext(IThreadContext threadContext) {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Connection handle requested");
}
try {
super.setThreadContext(threadContext);
} catch (ManifoldCFException e) {
e.printStackTrace();
}
}
On Wed, 25 May 2011 18:14:29 -0400, Karl Wright<[email protected]> wrote:
My guess would be inadvertent cross-thread object sharing again.
Nothing significant has changed in ManifoldCF in this area in a long
while.
Karl
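As an illustration of the kind of sharing meant here: a static field on a connector class is visible to every instance, and therefore to every thread that uses one. The sketch below uses a made-up class (SharedCounterSketch), not ManifoldCF code, and shows one possible way to keep a shared record counter consistent by using AtomicInteger rather than a plain static int incremented with ++.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class for illustration; not part of ManifoldCF.
class SharedCounterSketch {

  // Shared by all instances and threads; AtomicInteger makes each increment atomic.
  private static final AtomicInteger counter = new AtomicInteger(0);

  static int nextRecordNumber() {
    return counter.incrementAndGet();
  }

  // Starts `threads` workers that each take `perThread` numbers, then returns the final count.
  static int runWorkers(int threads, final int perThread) {
    counter.set(0);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(new Runnable() {
        public void run() {
          for (int j = 0; j < perThread; j++) {
            nextRecordNumber();
          }
        }
      });
      workers[i].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return counter.get();
  }

  public static void main(String[] args) {
    System.out.println(runWorkers(8, 10000));
  }
}
```

With a plain static int and ++, the same run could lose increments, because the read, add, and write steps of ++ are not atomic.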
On Wed, May 25, 2011 at 6:10 PM,<[email protected]> wrote:
I'm getting some very strange internal errors. I'd like to say I
haven't done anything, but something must have changed since the last
time.
Any ideas where I should be looking? Thanks!
SEVERE: Servlet.service() for servlet [jsp] in context with path [/mcf-crawler-ui] threw exception [java.lang.NullPointerException] with root cause
java.lang.NullPointerException
at org.apache.manifoldcf.agents.interfaces.OutputConnectorFactory$PoolKey.hashCode(OutputConnectorFactory.java:491)
at java.util.HashMap.get(Unknown Source)
at org.apache.manifoldcf.agents.interfaces.OutputConnectorFactory.release(OutputConnectorFactory.java:395)
at org.apache.jsp.editjob_jsp._jspService(editjob_jsp.java:606)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:419)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:391)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:304)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:462)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:562)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:395)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:250)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
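The top frames of the trace show the exception being thrown from a key's hashCode() while HashMap.get() is looking it up. A minimal sketch of how that can happen follows. SketchKey is a made-up class, not the real OutputConnectorFactory.PoolKey, and the assumption that the key hashes a class name together with a configuration object is mine, not taken from the ManifoldCF source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical classes for illustration; not ManifoldCF code.
class PoolKeySketch {

  // A key made of a class name plus a configuration object.
  static class SketchKey {
    private final String className;
    private final Map<String, String> config;

    SketchKey(String className, Map<String, String> config) {
      this.className = className;
      this.config = config;
    }

    @Override
    public int hashCode() {
      // Dereferences config, so a null configuration fails right here.
      return className.hashCode() + config.hashCode();
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof SketchKey)) {
        return false;
      }
      SketchKey other = (SketchKey) o;
      return className.equals(other.className) && config.equals(other.config);
    }
  }

  // Looks a key up in a non-empty map and reports whether the lookup threw an NPE.
  static boolean lookupThrowsNpe(Map<String, String> config) {
    Map<SketchKey, String> pool = new HashMap<SketchKey, String>();
    // Populate the map first so that get() has to hash the lookup key.
    pool.put(new SketchKey("OtherConnector", new HashMap<String, String>()), "pooled");
    try {
      pool.get(new SketchKey("DupFinderConnector", config));
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("null config throws? " + lookupThrowsNpe(null));
    System.out.println("empty config throws? " + lookupThrowsNpe(new HashMap<String, String>()));
  }
}
```

In the sketch the map itself is fine; the failure comes entirely from the null field inside the key, which would fit a connector whose configuration is null at the time it is released.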
/* $Id$ */
/**
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.manifoldcf.agents.output.dupfinder;
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Calendar;
import java.util.List;
import org.apache.log4j.Level;
import org.apache.manifoldcf.agents.interfaces.IOutputAddActivity;
import org.apache.manifoldcf.agents.interfaces.IOutputRemoveActivity;
import org.apache.manifoldcf.agents.interfaces.OutputSpecification;
import org.apache.manifoldcf.agents.interfaces.RepositoryDocument;
import org.apache.manifoldcf.agents.interfaces.ServiceInterruption;
import org.apache.manifoldcf.core.interfaces.ConfigParams;
import org.apache.manifoldcf.core.interfaces.DBInterfaceFactory;
import org.apache.manifoldcf.core.interfaces.IDBInterface;
import org.apache.manifoldcf.core.interfaces.IThreadContext;
import org.apache.manifoldcf.core.interfaces.ManifoldCFException;
import org.apache.manifoldcf.core.system.ManifoldCF;
import org.apache.manifoldcf.crawler.system.Logging;
/**
* This is a null output connector. It eats all output and simply logs the
* events.
*/
public class DupFinderConnector extends
org.apache.manifoldcf.agents.output.BaseOutputConnector {
public static final String _rcsid = "@(#)$Id$";
/** Ingestion activity */
public final static String INGEST_ACTIVITY = "document ingest";
/** Document removal activity */
public final static String REMOVE_ACTIVITY = "document deletion";
private DataManager dataManager = null;
private CIConnector ciConnector = null;
private String ciURL = "http://repository:34544/ci";
private String ciSID = null;
private Calendar calendar = Calendar.getInstance();
private HashsumGenerator hashGen = new HashsumGenerator();
private static int rcdCounter = 1;
public DupFinderConnector() {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Connector class instance requested");
}
}
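public void connect(ConfigParams configParams) {
// Assumption: the base class records configParams in connect() so that
// getConfiguration() can return it later; without this call it may return null.
super.connect(configParams);
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Connection instance requested");
}
}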
public void disconnect() {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Releasing connection instance");
}
}
/**
* Return the list of activities that this connector supports (i.e. writes
* into the log).
*
* @return the list.
*/
public String[] getActivitiesList() {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Executing " +
Thread.currentThread().getStackTrace()[1].getMethodName());
}
return new String[] { INGEST_ACTIVITY, REMOVE_ACTIVITY };
}
private File createTemporaryFile(String name, InputStream input) throws IOException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Creating temporary file with name = " + name);
}
// Create temp file.
File tmpFile = File.createTempFile(name, ".swcr.tmp");
// Delete temp file when program exits.
tmpFile.deleteOnExit();
// Write to temp file
FileOutputStream out = new FileOutputStream(tmpFile);
byte[] buffer = new byte[65536];
BufferedInputStream in = new BufferedInputStream(input);
try {
int bytesRead;
// Write only the bytes actually read; the last chunk is usually shorter than the buffer.
while ((bytesRead = in.read(buffer)) != -1) {
out.write(buffer, 0, bytesRead);
}
} catch (IOException e) {
Logging.connectors.error("Error reading the source file or writing to the tmpFile of \""
+ tmpFile.getCanonicalPath() + "\"", e);
// Rethrow so the caller does not hash and upload a partially written file.
throw e;
} finally {
out.close();
}
return tmpFile;
}
public int addOrReplaceDocument(String documentURI, String outputDescription,
RepositoryDocument document, String authorityNameString,
IOutputAddActivity activities) throws ManifoldCFException, ServiceInterruption {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Executing " +
Thread.currentThread().getStackTrace()[1].getMethodName());
}
getSession();
InputStream inputStream = null;
try {
File tmpFile = createTemporaryFile(documentURI.replaceAll("[\\\\/:\"*?<>|]+", "_"),
document.getBinaryStream());
inputStream = new FileInputStream(tmpFile);
byte[] hashsum = hashGen.digestFile(inputStream);
inputStream.close();
String hashsumHexValue = hashGen.getHex(hashsum);
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Attempting to insert document with properties of: rcdCounter=" + rcdCounter
+ ", documentURI=" + documentURI + ", outputDescription=" + outputDescription
+ ", authorityNameString=" + authorityNameString + ", binaryLength=" + document.getBinaryLength()
+ ", dupNum=" + 1 + ", hashSumValue=" + hashsumHexValue);
}
inputStream = new FileInputStream(tmpFile);
String timeStamp = calendar.getTime().toString();
boolean isDuplicate = dataManager.insertData(timeStamp, rcdCounter++, documentURI,
outputDescription, authorityNameString, document.getBinaryLength(), 1,
hashsumHexValue, inputStream);
if (!isDuplicate) {
uploadDocumentToCI(timeStamp, documentURI, hashsumHexValue, inputStream);
}
activities.recordActivity(null, INGEST_ACTIVITY, Long.valueOf(document.getBinaryLength()),
documentURI, "OK", (isDuplicate ? "Duplicate Found" : ""));
} catch (IOException e) {
Logging.connectors.error("I/O error while processing " + documentURI, e);
} finally {
// inputStream is still null if createTemporaryFile() failed.
if (inputStream != null) {
try {
inputStream.close();
} catch (IOException e) {
Logging.connectors.warn("Failed to close the inputStream " + inputStream, e);
}
}
}
return DOCUMENTSTATUS_ACCEPTED;
}
public void uploadDocumentToCI(String timestampVal, String docURIVal,
String hashsumVal, InputStream docInputStream) throws ManifoldCFException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Executing " +
Thread.currentThread().getStackTrace()[1].getMethodName());
}
int slashIndex = docURIVal.lastIndexOf("/");
String filename = docURIVal.substring(slashIndex + 1);
// lastIndexOf() returns -1 when there is no slash; use an empty path in that case.
String filepath = (slashIndex >= 0) ? docURIVal.substring(0, slashIndex) : "";
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Attempting CI upload for filename = \"" + filename
+ "\", filepath = \"" + filepath + "\"");
}
List<UploadParm> uploads = Arrays.asList(new UploadParm[] {
new UploadParm(filename, docInputStream) });
List<RequestParm> parms = Arrays.asList(new RequestParm[] {
new RequestParm("sid", ciSID),
new RequestParm("tplid", "A20000"),
new RequestParm("filename", filename),
new RequestParm("filepath", filepath),
new RequestParm("datecreated", timestampVal),
new RequestParm("datemodified", timestampVal),
new RequestParm("dateaccessed", timestampVal),
new RequestParm("hashcode", hashsumVal) });
CIResponse ciRsp = ciConnector.sendTxn("createtopic", parms,
uploads);
if (ciRsp.isError()) {
String msg = "Upload failed for \"" + filename + "\", filepath = \"" + filepath
+ "\", ciURL=" + ciConnector.getURL() + ", rspXML = " + ciRsp.getXML();
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug(msg);
}
throw new ManifoldCFException(msg,
ManifoldCFException.REPOSITORY_CONNECTION_ERROR);
} else {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Upload succeeded for \"" + filename + "\", filepath = \"" + filepath
+ "\", ciSID=" + ciSID + ", ciURL=" + ciConnector.getURL() + ", rspXML = " + ciRsp.getXML());
}
}
}
public String getOutputDescription(OutputSpecification spec) throws
ManifoldCFException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Executing " +
Thread.currentThread().getStackTrace()[1].getMethodName());
}
return "TODO";
}
public void removeDocument(String documentURI, String
outputDescription, IOutputRemoveActivity activities) throws
ManifoldCFException, ServiceInterruption {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Executing " +
Thread.currentThread().getStackTrace()[1].getMethodName());
}
activities.recordActivity(null, REMOVE_ACTIVITY, null,
documentURI, "OK", null);
}
protected void getSession() throws ManifoldCFException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Executing " +
Thread.currentThread().getStackTrace()[1].getMethodName());
}
if (dataManager == null) {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Establishing DataManager instance.");
}
IDBInterface databaseHandle = DBInterfaceFactory.make(currentContext,
ManifoldCF.getMasterDatabaseName(), ManifoldCF.getMasterDatabaseUsername(),
ManifoldCF.getMasterDatabasePassword());
dataManager = new DataManager(currentContext, databaseHandle);
}
if (ciConnector == null) {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Establishing CIConnector instance.");
}
CIConnector.get().initialize(ciURL);
ciConnector = CIConnector.get();
List<RequestParm> parms = Arrays.asList(new RequestParm[] {
new RequestParm("user", ""),
new RequestParm("pwd", "") });
CIResponse ciRsp = ciConnector.sendTxn("login", parms);
if (ciRsp.isError()) {
String msg = "Login failed, ciURL=" + ciConnector.getURL() + ", rspXML = " + ciRsp.getXML();
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug(msg);
}
throw new ManifoldCFException(msg,
ManifoldCFException.REPOSITORY_CONNECTION_ERROR);
} else {
// Read the session id only after the login is known to have succeeded.
ciSID = ((LoginTxnResponse) ciRsp).getSID();
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Login succeeded, ciSID=" + ciSID + ", ciURL=" + ciConnector.getURL()
+ ", rspXML = " + ciRsp.getXML());
}
}
}
}
public void install(IThreadContext threadContext) throws
ManifoldCFException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Installing Connector");
}
IDBInterface mainDatabase =
DBInterfaceFactory.make(threadContext, ManifoldCF.getMasterDatabaseName(),
ManifoldCF.getMasterDatabaseUsername(), ManifoldCF.getMasterDatabasePassword());
dataManager = new DataManager(threadContext, mainDatabase);
mainDatabase.beginTransaction();
try {
dataManager.install();
} catch (ManifoldCFException e) {
mainDatabase.signalRollback();
throw e;
} catch (Error e) {
mainDatabase.signalRollback();
throw e;
} finally {
mainDatabase.endTransaction();
}
}
public void deinstall(IThreadContext threadContext) throws
ManifoldCFException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Deinstalling Connector");
}
IDBInterface mainDatabase =
DBInterfaceFactory.make(threadContext, ManifoldCF.getMasterDatabaseName(),
ManifoldCF.getMasterDatabaseUsername(), ManifoldCF.getMasterDatabasePassword());
dataManager = new DataManager(threadContext, mainDatabase);
mainDatabase.beginTransaction();
try {
dataManager.deinstall();
} catch (ManifoldCFException e) {
mainDatabase.signalRollback();
throw e;
} catch (Error e) {
mainDatabase.signalRollback();
throw e;
} finally {
mainDatabase.endTransaction();
}
}
public void clearThreadContext() {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Releasing connection handle");
}
super.clearThreadContext();
}
public void setThreadContext(IThreadContext threadContext) throws ManifoldCFException {
if (Logging.connectors.isEnabledFor(Level.DEBUG)) {
Logging.connectors.debug("Connection handle requested");
}
// Let a failure in the base class propagate instead of printing it and carrying on.
super.setThreadContext(threadContext);
// You can put data in the thread context, but this is not a good design, as thread contexts
// will get reassigned to perform other tasks. If you did need to do such a thing, the way to do
// it would be to request the object with the following lines, where idNum is a static member.
/*
if (currentContext != null) {
Object id = currentContext.get("id");
if (id == null) {
currentContext.save("id", new Integer(idNum));
idNum++;
} else {
}
}
*/
}
}
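
For reference, the stream copy in createTemporaryFile has to honour the count returned by read(), since read() may fill only part of the buffer. A minimal, self-contained sketch of such a loop is below. CopySketch and its method names are made up for illustration and are not part of the connector.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration; not part of the connector.
class CopySketch {

  // Copies everything from in to out and returns the number of bytes copied.
  static long copy(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[65536];
    long total = 0;
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1) {
      // Only the first bytesRead bytes of the buffer hold fresh data.
      out.write(buffer, 0, bytesRead);
      total += bytesRead;
    }
    return total;
  }

  // Convenience wrapper for in-memory data, used here to check the lengths.
  static byte[] copyBytes(byte[] input) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      copy(new ByteArrayInputStream(input), out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    System.out.println(copyBytes(new byte[100000]).length);
  }
}
```

Writing the whole buffer on every pass would instead pad the output to a multiple of the buffer size, and the hash computed from the temporary file would then no longer match the original document.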