[jira] [Commented] (CONNECTORS-1492) GSOC: Add support for Docker

2020-12-09 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246574#comment-17246574
 ] 

Michael Cizmar commented on CONNECTORS-1492:


We've been using Docker Compose for this, with a setup script that downloads the 
necessary files for development purposes.  I agree with you, Olivier, that a 
reduced set of connectors, particularly the Windows File Share (which is going 
to be less strategically important), would be better than none.  Separately, 
we could be a little more opinionated about the configuration, which would be of 
tremendous value to the adoption of the platform.

> GSOC: Add support for Docker
> 
>
> Key: CONNECTORS-1492
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1492
> Project: ManifoldCF
>  Issue Type: New Feature
>Reporter: Piergiorgio Lucidi
>Assignee: Piergiorgio Lucidi
>Priority: Major
>  Labels: devops, docker, gsoc2018
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> This is a project idea for [Google Summer of 
> Code|https://summerofcode.withgoogle.com/] (GSOC).
> To discuss this or other ideas with your potential mentor from the Apache 
> ManifoldCF project, sign up and post to the dev@manifoldcf.apache.org list, 
> including "[GSOC]" in the subject. You may also comment on this Jira issue if 
> you have created an account. 
> We would like to adopt Docker to provide ready-to-use images with a 
> preconfigured architecture stack for ManifoldCF. This will include ManifoldCF 
> itself but also the related database, which can be MySQL, PostgreSQL and so on.
> This will help developers to set up a complete ManifoldCF installation and 
> put it into production.
> You will be involved in the following development tasks and will 
> learn how to:
>  * Write Docker files
>  * Write Docker Compose files
>  * Implement unit tests
>  * Build all the integration tests
>  * Write the documentation for the new component
> We have complete documentation about ManifoldCF:
> [https://manifoldcf.apache.org/release/release-2.9.1/en_US/concepts.html]
> Take a look at our book to understand better the framework and how to extend 
> it in different ways:
> [https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs]
>  
> Prospective GSOC mentor: 
> [piergior...@apache.org|mailto:piergior...@apache.org]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CONNECTORS-1659) Enable SSL/TLS in Jetty server

2020-11-23 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237830#comment-17237830
 ] 

Michael Cizmar commented on CONNECTORS-1659:


If the Jetty server is starting up without SSL, then that is not a code issue; 
it is most likely related to your configuration.  The guide that I have used is 
no longer on the web but can be found on archive.org:

[https://web.archive.org/web/20200924221213/https://www.eclipse.org/jetty/documentation/current/configuring-ssl.html]

Besides not having SSL, what errors or exceptions do you see in the logs?
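
One configuration problem that is cheap to rule out is a keystore that the JVM cannot open at all (wrong path, wrong type, or wrong password). The following is a minimal sketch of such a check using only the JDK's `KeyStore` API; the class name `KeystoreCheck` and its command-line arguments are made up for illustration and are not part of ManifoldCF or Jetty.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; not part of ManifoldCF or Jetty.
final class KeystoreCheck {

  // Opens the keystore with the given type and password and returns its aliases.
  // Any failure is wrapped in one unchecked exception with a readable message.
  static List<String> listAliases(Path file, String type, char[] password) {
    try (InputStream in = Files.newInputStream(file)) {
      KeyStore ks = KeyStore.getInstance(type);
      ks.load(in, password);
      return Collections.list(ks.aliases());
    } catch (Exception e) {
      throw new IllegalStateException("Cannot load keystore " + file + ": " + e, e);
    }
  }

  public static void main(String[] args) {
    if (args.length < 2) {
      System.out.println("usage: KeystoreCheck <keystore-file> <password> [type]");
      return;
    }
    String type = args.length > 2 ? args[2] : KeyStore.getDefaultType();
    System.out.println("Aliases: " + listAliases(Paths.get(args[0]), type, args[1].toCharArray()));
  }
}
```

If this fails, or the list does not contain the expected key alias, the problem lies in the keystore or its settings rather than in the server code.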

> Enable SSL/TLS in Jetty server
> --
>
> Key: CONNECTORS-1659
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1659
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Framework core
>Affects Versions: ManifoldCF 2.16
>Reporter: Jegan Baskaran
>Priority: Minor
>
> Could you please give some idea of how to enable SSL/TLS in the Jetty 
> server? Currently the Jetty server is not accepting SSL details.  I added 
> some SSL details as per 
> [https://wiki.eclipse.org/Jetty/Howto/Configure_SSL], 
> but it does not work as expected. Do we need to change the code in 
> ManifoldCFJettyRunner.java? Could you please help with this issue?





Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-15 Thread Michael Cizmar
Karl,

This occurs only on Mac, and then only with Maven.  Both of these are secondary
targets for the build/release process.  I don't know that a follow-up RC would
make any difference, because the original build
works on the targeted platforms and, as you mentioned, this is just a path
issue in a test; no code has been modified in this connector.

M

On Mon, Sep 14, 2020 at 6:28 AM Karl Wright  wrote:

> I have time this week only to spin a new RC, if that's what the community
> wants, but not to modify the maven build to download ElasticSearch and
> unpack it.  Mind you, there's still no difference in production code
> between RC0, RC1, and what's currently on the branch.  We've been fixing a
> test only.
>
> Please let me know what you feel is necessary for a release to succeed.
> Karl
>
>
> On Sun, Sep 13, 2020 at 6:36 AM Karl Wright  wrote:
>
> > Works fine now for Maven (although I had to upgrade the version of
> > failsafe plugin to work with my current version of Maven), provided you
> run
> > the ant make-dependencies first.
> >
> > Karl
> >
> >
> > On Sat, Sep 12, 2020 at 9:28 PM Karl Wright  wrote:
> >
> >> I used a -D variable that is set differently by maven and ant builds.
> >>
> >> It works fine for ant.  Bandwidth limitations tonight mean I will try
> >> tomorrow morning for maven.
> >>
> >> Karl
> >>
> >>
> >> On Sat, Sep 12, 2020 at 9:26 PM Michael Cizmar <
> mich...@michaelcizmar.com>
> >> wrote:
> >>
> >>> Ok.  What about an environmental variable that is used for the download
> >>> and
> >>> then is read in the test case?
> >>>
> >>> On Sat, Sep 12, 2020 at 7:14 PM Karl Wright 
> wrote:
> >>>
> >>> > Ok, the path change will break the Ant test.  The maven test seems to
> >>> have
> >>> > the current directory set connectors/elasticsearch during testing;
> the
> >>> ant
> >>> > test explicitly sets it to one of the build directories below that.
> >>> But in
> >>> > any case I will need to consider how the test can be changed to use a
> >>> > specific ES source directory; maybe a -D can be pushed into it.
> >>> >
> >>> > Karl
> >>> >
> >>> > On Sat, Sep 12, 2020 at 3:30 PM Michael Cizmar <
> >>> mich...@michaelcizmar.com>
> >>> > wrote:
> >>> >
> >>> > > Two in BaseITHSQLDB's setupElasticSearch method
> >>> > >
> >>> > > First to set the Java_HOME
> >>> > > Map envs = pb.environment();
> >>> > > if (System.getenv("JAVA_HOME")!= null) {
> >>> > > envs.put("JAVA_HOME",System.getenv("JAVA_HOME"));
> >>> > > } else {
> >>> > > throw new Exception("Missing JAVA_HOME as a system environment
> >>> > variable");
> >>> > > }
> >>> > >
> >>> > > The second removing the double dot
> >>> > > if (isUnix) {
> >>> > > pb.command("bash", "-c",
> >>> > > "./test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch
> >>> > > -q -Expack.ml.enabled=false");
> >>> > > System.out.println("Unix process");
> >>> > > } else {
> >>> > > pb.command("cmd.exe", "/c", "..\\test-materials\\windows\\
> >>> > > elasticsearch-7.6.2\\bin\\elasticsearch.bat -q
> >>> > -Expack.ml.enabled=false");
> >>> > > System.out.println("Windows process");
> >>> > > }
> >>> > >
> >>> > >

Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
Ok.  What about an environment variable that is used for the download and
is then read in the test case?
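
A minimal sketch of what such a lookup could look like in the test, assuming an environment variable with a -D system property as a second source. The names `ES_TEST_HOME` and `es.test.home` are hypothetical and do not exist in the ManifoldCF build; the default is the relative path the test already uses.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper; ES_TEST_HOME and es.test.home are illustrative names only.
final class EsHomeResolver {
  static final String ENV_NAME = "ES_TEST_HOME";
  static final String PROP_NAME = "es.test.home";
  static final String DEFAULT_HOME = "./test-materials/unix/elasticsearch-7.6.2";

  // Pure lookup: environment variable first, then the -D property, then the default.
  static Path resolve(Map<String, String> env, String propValue) {
    String fromEnv = env.get(ENV_NAME);
    if (fromEnv != null && !fromEnv.isEmpty()) {
      return Paths.get(fromEnv);
    }
    if (propValue != null && !propValue.isEmpty()) {
      return Paths.get(propValue);
    }
    return Paths.get(DEFAULT_HOME);
  }

  // Convenience overload that reads the real process environment and system properties.
  static Path resolve() {
    return resolve(System.getenv(), System.getProperty(PROP_NAME));
  }

  public static void main(String[] args) {
    System.out.println("ES home: " + resolve());
  }
}
```

Keeping the lookup in one place means the download step and the test only have to agree on a single name, whichever build tool sets it.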

On Sat, Sep 12, 2020 at 7:14 PM Karl Wright  wrote:

> Ok, the path change will break the Ant test.  The maven test seems to have
> the current directory set connectors/elasticsearch during testing; the ant
> test explicitly sets it to one of the build directories below that.  But in
> any case I will need to consider how the test can be changed to use a
> specific ES source directory; maybe a -D can be pushed into it.
>
> Karl
>
> On Sat, Sep 12, 2020 at 3:30 PM Michael Cizmar 
> wrote:
>
> > Two in BaseITHSQLDB's setupElasticSearch method
> >
> > First to set the Java_HOME
> > Map envs = pb.environment();
> > if (System.getenv("JAVA_HOME")!= null) {
> > envs.put("JAVA_HOME",System.getenv("JAVA_HOME"));
> > } else {
> > throw new Exception("Missing JAVA_HOME as a system environment
> variable");
> > }
> >
> > The second removing the double dot
> > if (isUnix) {
> > pb.command("bash", "-c",
> > "./test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch
> > -q -Expack.ml.enabled=false");
> > System.out.println("Unix process");
> > } else {
> > pb.command("cmd.exe", "/c", "..\\test-materials\\windows\\
> > elasticsearch-7.6.2\\bin\\elasticsearch.bat -q
> -Expack.ml.enabled=false");
> > System.out.println("Windows process");
> > }
> >
> >
> > ===
> >
> > ---
> >
> >
> connector/src/test/java/org/apache/manifoldcf/agents/output/elasticsearch/tests/BaseITHSQLDB.java
> > (revision
> > 1881665)
> >
> > +++
> >
> >
> connector/src/test/java/org/apache/manifoldcf/agents/output/elasticsearch/tests/BaseITHSQLDB.java
> > (working
> > copy)
> >
> > @@ -32,6 +32,7 @@
> >
> >  import org.apache.http.util.EntityUtils;
> >
> >  import org.apache.http.impl.client.HttpClients;
> >
> >  import java.io.IOException;
> >
> > +import java.util.Map;
> >
> >  import java.io.File;
> >
> >
> >
> >
> >
> > @@ -44,11 +45,13 @@
> >
> >  {
> >
> >
> >
> >final static boolean isUnix;
> >
> > +
> >
> >static {
> >
> >  final String os = System.getProperty("os.name").toLowerCase();
> >
> >  if (os.contains("win")) {
> >
> >isUnix = false;
> >
> >  } else {
> >
> > +  //Unix
> >
> >isUnix = true;
> >
> >  }
> >
> >}
> >
> > @@ -84,9 +87,16 @@
> >
> >  final File absFile = new File(".").getAbsoluteFile();
> >
> >  System.out.println("ES working directory is '"+absFile+"'");
> >
> >  pb.directory(absFile);
> >
> > +Map envs = pb.environment();
> >
> > +if (System.getenv("JAVA_HOME")!= null) {
> >
> > +  envs.put("JAVA_HOME",System.getenv("JAVA_HOME"));
> >
> > +} else {
> >
> > +  throw new Exception("Missing JAVA_HOME as a system environment
> > variable");
> >
> > +}
> >
> >
> >
> > +
> >
> >  if (isUnix) {
> >
> > -  pb.command("bash", "-c",
> > "../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch -q
> > -Expack.ml.enabled=false");
> >
> > +  pb.command("bash", "-c",
> > "./test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch
> > -q -Expack.ml.enabled=false");
> >
> >System.out.println("Unix process");
> >
> >  } else {
> >
> >pb.command("cmd.exe", "/c",
> > "..\\test-materials\\windows\\elasticsearch-7.6.2\\bin\\elasticsearch.bat
> > -q -Expack.ml.enabled=false");
> >
> > @@ -93,10 +103,13 @@
> >
> >System.out.println("Windows process");
> >
> >  }
> >
> >
> >
> > +
> >
> > +
> >
> >  File log = new File("es.log");
> >
> >  pb.redirectErrorStream(true);
> >
> >  pb.redirectOutput(ProcessBuilder.Redirect.appendTo(log));
> >
> >  esTestProcess = pb.start();
> >
> > +
> >
> >

Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
Two changes, both in BaseITHSQLDB's setupElasticSearch method.

The first sets JAVA_HOME for the process:

    Map envs = pb.environment();
    if (System.getenv("JAVA_HOME") != null) {
      envs.put("JAVA_HOME", System.getenv("JAVA_HOME"));
    } else {
      throw new Exception("Missing JAVA_HOME as a system environment variable");
    }

The second removes the double dot from the Unix path:

    if (isUnix) {
      pb.command("bash", "-c",
        "./test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch -q -Expack.ml.enabled=false");
      System.out.println("Unix process");
    } else {
      pb.command("cmd.exe", "/c",
        "..\\test-materials\\windows\\elasticsearch-7.6.2\\bin\\elasticsearch.bat -q -Expack.ml.enabled=false");
      System.out.println("Windows process");
    }
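
As a possible follow-up, the test could check that the launcher script exists before starting the process, so a wrong relative path fails immediately with a clear message instead of only showing up in es.log. This is a sketch under that assumption; `LaunchPrecheck` and `requireScript` are hypothetical names, not existing code in BaseITHSQLDB.

```java
import java.io.File;

// Hypothetical pre-launch check; not part of the existing test class.
final class LaunchPrecheck {

  // Returns the script file if it exists under workingDir, otherwise fails fast.
  static File requireScript(File workingDir, String relativePath) {
    File script = new File(workingDir, relativePath);
    if (!script.isFile()) {
      throw new IllegalStateException("ElasticSearch launcher not found: " + script.getPath()
          + " (working directory: " + workingDir.getPath() + ")");
    }
    return script;
  }

  public static void main(String[] args) {
    File workingDir = new File(".").getAbsoluteFile();
    try {
      File script = requireScript(workingDir,
          "test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch");
      System.out.println("Would launch " + script);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```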


===
--- connector/src/test/java/org/apache/manifoldcf/agents/output/elasticsearch/tests/BaseITHSQLDB.java (revision 1881665)
+++ connector/src/test/java/org/apache/manifoldcf/agents/output/elasticsearch/tests/BaseITHSQLDB.java (working copy)
@@ -32,6 +32,7 @@
 import org.apache.http.util.EntityUtils;
 import org.apache.http.impl.client.HttpClients;
 import java.io.IOException;
+import java.util.Map;
 import java.io.File;
 
 
@@ -44,11 +45,13 @@
 {
 
   final static boolean isUnix;
+
   static {
     final String os = System.getProperty("os.name").toLowerCase();
     if (os.contains("win")) {
       isUnix = false;
     } else {
+      //Unix
       isUnix = true;
     }
   }
@@ -84,9 +87,16 @@
     final File absFile = new File(".").getAbsoluteFile();
     System.out.println("ES working directory is '"+absFile+"'");
     pb.directory(absFile);
+    Map envs = pb.environment();
+    if (System.getenv("JAVA_HOME")!= null) {
+      envs.put("JAVA_HOME",System.getenv("JAVA_HOME"));
+    } else {
+      throw new Exception("Missing JAVA_HOME as a system environment variable");
+    }
 
+
     if (isUnix) {
-      pb.command("bash", "-c", "../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch -q -Expack.ml.enabled=false");
+      pb.command("bash", "-c", "./test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch -q -Expack.ml.enabled=false");
       System.out.println("Unix process");
     } else {
       pb.command("cmd.exe", "/c", "..\\test-materials\\windows\\elasticsearch-7.6.2\\bin\\elasticsearch.bat -q -Expack.ml.enabled=false");
@@ -93,10 +103,13 @@
       System.out.println("Windows process");
     }
 
+
+
     File log = new File("es.log");
     pb.redirectErrorStream(true);
     pb.redirectOutput(ProcessBuilder.Redirect.appendTo(log));
     esTestProcess = pb.start();
+
     System.out.println("ElasticSearch is starting...");
     //the default port is 9200





On Sat, Sep 12, 2020 at 2:19 PM Karl Wright  wrote:

> What changes did you make?
> Karl
>
>
> On Sat, Sep 12, 2020 at 2:41 PM Michael Cizmar 
> wrote:
>
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Response from ES: HTTP/1.1 200 OK
> > ES came up!
> > ElasticSearch is started on port 9200
> >
> > @Karl - How would you like this packaged up?
> >
> > On Sat, Sep 12, 2020 at 12:52 PM Michael Cizmar <
> mich...@michaelcizmar.com
> > >
> > wrote:
> >
> > > Agreed.  We want to limit this unpacking because elastic packages them
> > > differently.  I started down the rabbit hole of making a macOS download
> > but
> > > then got into permission issues and started issuing chmod.
> > >
> > > When I get back from lunch I am going to just set the JDK of the
> process
> > > to be the system environment variable and I think that will fix the
> > problem.
> > >
> > > On Sat, Sep 12, 2020 at 12:37 PM Karl Wright 
> wrote:
> > >
> > >> Hi Michael,
> > >>
> > >>
> > >>
> > >> JAVA_HOME is not usually a requirement for Maven building but it's not
> > >>
> > >> unreasonable to have it, especially since maven itself looks for it.
> > >>
> > >>
> > >>
> > >> I suspect that, in order for the Maven build to work, you currently
> need
> > >> to
> > >>
> > >> do this:
> > >>
> > >> - Set JAVA_HOME
> > >>
> > >> - Run the ant build first
> > >>
> > >>
> > >>
> > >> That's a little pain in the butt but we can fix this going forwa

Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
Didn't reach ES; waiting...
Didn't reach ES; waiting...
Didn't reach ES; waiting...
Response from ES: HTTP/1.1 200 OK
ES came up!
ElasticSearch is started on port 9200

@Karl - How would you like this packaged up?

On Sat, Sep 12, 2020 at 12:52 PM Michael Cizmar 
wrote:

> Agreed.  We want to limit this unpacking because elastic packages them
> differently.  I started down the rabbit hole of making a macOS download but
> then got into permission issues and started issuing chmod.
>
> When I get back from lunch I am going to just set the JDK of the process
> to be the system environment variable and I think that will fix the problem.
>
> On Sat, Sep 12, 2020 at 12:37 PM Karl Wright  wrote:
>
>> Hi Michael,
>>
>>
>>
>> JAVA_HOME is not usually a requirement for Maven building but it's not
>>
>> unreasonable to have it, especially since maven itself looks for it.
>>
>>
>>
>> I suspect that, in order for the Maven build to work, you currently need
>> to
>>
>> do this:
>>
>> - Set JAVA_HOME
>>
>> - Run the ant build first
>>
>>
>>
>> That's a little pain in the butt but we can fix this going forward - at
>>
>> least the unpacking part.
>>
>> Karl
>>
>>
>>
>>
>>
>> On Sat, Sep 12, 2020 at 1:33 PM Michael Cizmar > >
>>
>> wrote:
>>
>>
>>
>> > I figured it out.  Just working through it.  There was a path issue.
>> When
>>
>> > I start the process it's looking for a JAVA_HOME.  I've got that now
>> set to
>>
>> > the JDK that comes with elastic.   The download of elastic that ant is
>>
>> > doing is specific for Linux so that's failing.
>>
>> >
>>
>> > Does the build require JAVA_HOME to be set?   The machine I'm working on
>>
>> > does not have that set.
>>
>> >
>>
>> > On Sat, Sep 12, 2020 at 12:27 PM Karl Wright 
>> wrote:
>>
>> >
>>
>> > > The ant build unpacks the ES binary and puts in the place needed to
>> run.
>>
>> > > My guess is that we need to add similar unpacking to the maven pom.
>> I'll
>>
>> > > see if there is a way to do that.
>>
>> > > Karl
>>
>> > >
>>
>> > >
>>
>> > > On Sat, Sep 12, 2020 at 12:00 PM Michael Cizmar <
>>
>> > mich...@michaelcizmar.com
>>
>> > > >
>>
>> > > wrote:
>>
>> > >
>>
>> > > > I think you are right.  The es.log file contains the following:
>>
>> > > >
>>
>> > > > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch:
>> No
>>
>> > > such
>>
>> > > > file or directory
>>
>> > > > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch:
>> No
>>
>> > > such
>>
>> > > > file or directory
>>
>> > > > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch:
>> No
>>
>> > > such
>>
>> > > > file or directory
>>
>> > > > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch:
>> No
>>
>> > > such
>>
>> > > > file or directory
>>
>> > > > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch:
>> No
>>
>> > > such
>>
>> > > > file or directory
>>
>> > > > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch:
>> No
>>
>> > > such
>>
>> > > > file or directory
>>
>> > > >
>>
>> > > > On Sat, Sep 12, 2020 at 10:49 AM Karl Wright 
>>
>> > wrote:
>>
>> > > >
>>
>> > > > > Ok, I didn't realize this was from Maven only.  It may be a
>> working
>>
>> > > > > directory or dependency version issue.
>>
>> > > > >
>>
>> > > > >
>>
>> > > > > On Sat, Sep 12, 2020 at 11:37 AM Michael Cizmar <
>>
>> > > > mich...@michaelcizmar.com
>>
>> > > > > >
>>
>> > > > > wrote:
>>
>> > > > >
>>
>> > > > > > I reproduced the infinite Elastic Search Loop when building from
>>
>> > > maven.
>>
>> > >

Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
I figured it out and am just working through it.  There was a path issue.  When
I start the process, it looks for JAVA_HOME.  I've now got that set to
the JDK that comes with Elastic.  The download of Elastic that ant is
doing is specific to Linux, so that's failing.

Does the build require JAVA_HOME to be set?   The machine I'm working on
does not have that set.
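
One option for machines without JAVA_HOME, sketched here as an assumption rather than something the build does today, is to fall back to the JVM that is running the test, which the standard `java.home` system property identifies. The class and method names below are hypothetical.

```java
import java.util.Map;

// Hypothetical helper; shows a fallback when JAVA_HOME is not set in the environment.
final class JavaHomeFallback {

  // Prefers the JAVA_HOME value from the environment, else the home of the running JVM.
  static String chooseJavaHome(String envJavaHome, String runningJvmHome) {
    if (envJavaHome != null && !envJavaHome.trim().isEmpty()) {
      return envJavaHome;
    }
    return runningJvmHome;
  }

  // Sets JAVA_HOME for the child process only; the parent environment is untouched.
  static void apply(ProcessBuilder pb) {
    Map<String, String> env = pb.environment();
    env.put("JAVA_HOME",
        chooseJavaHome(System.getenv("JAVA_HOME"), System.getProperty("java.home")));
  }

  public static void main(String[] args) {
    ProcessBuilder pb = new ProcessBuilder("bash", "-c", "echo $JAVA_HOME");
    apply(pb);
    System.out.println("Child JAVA_HOME would be: " + pb.environment().get("JAVA_HOME"));
  }
}
```

Note that on JDK 8 and earlier `java.home` usually points at the `jre` subdirectory of the JDK, so whether that is acceptable to the ElasticSearch launcher would need to be verified.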

On Sat, Sep 12, 2020 at 12:27 PM Karl Wright  wrote:

> The ant build unpacks the ES binary and puts in the place needed to run.
> My guess is that we need to add similar unpacking to the maven pom.  I'll
> see if there is a way to do that.
> Karl
>
>
> On Sat, Sep 12, 2020 at 12:00 PM Michael Cizmar  >
> wrote:
>
> > I think you are right.  The es.log file contains the following:
> >
> > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No
> such
> > file or directory
> > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No
> such
> > file or directory
> > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No
> such
> > file or directory
> > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No
> such
> > file or directory
> > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No
> such
> > file or directory
> > bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No
> such
> > file or directory
> >
> > On Sat, Sep 12, 2020 at 10:49 AM Karl Wright  wrote:
> >
> > > Ok, I didn't realize this was from Maven only.  It may be a working
> > > directory or dependency version issue.
> > >
> > >
> > > On Sat, Sep 12, 2020 at 11:37 AM Michael Cizmar <
> > mich...@michaelcizmar.com
> > > >
> > > wrote:
> > >
> > > > I reproduced the infinite Elastic Search Loop when building from
> maven.
> > > > I'm investigating it now.
> > > >
> > > > On Sat, Sep 12, 2020 at 9:30 AM Michael Cizmar <
> > > mich...@michaelcizmar.com>
> > > > wrote:
> > > >
> > > > > I'll take a look at the build this AM.
> > > > >
> > > > > On Sat, Sep 12, 2020 at 5:13 AM Karl Wright 
> > > wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> I don't have a Mac, so if we can't figure out how to get ES to
> start
> > > > >> properly on a Mac, I have no ability to debug it myself.  I would
> be
> > > > >> forced
> > > > >> to recommend we just disable the test - or continue with the vote,
> > > since
> > > > >> nothing has changed and we released it this way for the last
> release
> > > in
> > > > >> April.  We have people waiting for the Postgresql updates.
> > > > >>
> > > > >> Karl
> > > > >>
> > > > >>
> > > > >> On Wed, Sep 9, 2020 at 11:54 AM Cihad Guzel 
> > > wrote:
> > > > >>
> > > > >> > Michael,
> > > > >> >
> > > > >> > I have log lines repeating as follows:
> > > > >> >
> > > > >> > ---
> > > > >> >  T E S T S
> > > > >> > ---
> > > > >> > Running
> > > > >> >
> > > > >>
> > > >
> > org.apache.manifoldcf.agents.output.elasticsearch.tests.APISanityHSQLDBIT
> > > > >> > Configuration file successfully read
> > > > >> > [main] INFO org.eclipse.jetty.util.log - Logging initialized
> > @7246ms
> > > > >> > [main] INFO org.eclipse.jetty.server.Server -
> > jetty-9.2.3.v20140905
> > > > >> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> > > Started
> > > > >> > o.e.j.w.WebAppContext@2ad48653
> > > > >> >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> {/mcf-crawler-ui,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-5325495261063795321.dir/webapp/,AVAILABLE}{../dependency/mcf-crawler-ui.war}
> > > > >> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> > > Started
> > > > >> > o.e.j.w.WebAppContext@6bb4dd34
> 

Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
I think you are right.  The es.log file contains the following:

bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No such
file or directory
bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No such
file or directory
bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No such
file or directory
bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No such
file or directory
bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No such
file or directory
bash: ../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch: No such
file or directory
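
The error is consistent with the same relative path being resolved against two different working directories. A small sketch of that effect, using made-up directory names (`/work/...` is not the real checkout location): Karl's note in this thread is that Maven runs the test from connectors/elasticsearch, while ant runs it from a build directory one level below.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustration only; the /work/... directories are hypothetical.
final class RelativePathDemo {

  // Resolves a relative path against a working directory and collapses "." and "..".
  static Path resolve(String workingDir, String relative) {
    return Paths.get(workingDir).resolve(relative).normalize();
  }

  public static void main(String[] args) {
    String script = "test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch";
    String mavenDir = "/work/connectors/elasticsearch";       // assumed Maven working directory
    String antDir = "/work/connectors/elasticsearch/build";   // assumed ant working directory

    System.out.println("ant,   ../ : " + resolve(antDir, "../" + script));
    System.out.println("maven, ../ : " + resolve(mavenDir, "../" + script));
    System.out.println("maven, ./  : " + resolve(mavenDir, "./" + script));
  }
}
```

With these assumed directories, the `../` form lands inside connectors/elasticsearch only for the ant working directory; from the Maven working directory it points one level too high, which is what a "No such file or directory" from bash would look like.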

On Sat, Sep 12, 2020 at 10:49 AM Karl Wright  wrote:

> Ok, I didn't realize this was from Maven only.  It may be a working
> directory or dependency version issue.
>
>
> On Sat, Sep 12, 2020 at 11:37 AM Michael Cizmar  >
> wrote:
>
> > I reproduced the infinite Elastic Search Loop when building from maven.
> > I'm investigating it now.
> >
> > On Sat, Sep 12, 2020 at 9:30 AM Michael Cizmar <
> mich...@michaelcizmar.com>
> > wrote:
> >
> > > I'll take a look at the build this AM.
> > >
> > > On Sat, Sep 12, 2020 at 5:13 AM Karl Wright 
> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I don't have a Mac, so if we can't figure out how to get ES to start
> > >> properly on a Mac, I have no ability to debug it myself.  I would be
> > >> forced
> > >> to recommend we just disable the test - or continue with the vote,
> since
> > >> nothing has changed and we released it this way for the last release
> in
> > >> April.  We have people waiting for the Postgresql updates.
> > >>
> > >> Karl
> > >>
> > >>
> > >> On Wed, Sep 9, 2020 at 11:54 AM Cihad Guzel 
> wrote:
> > >>
> > >> > Michael,
> > >> >
> > >> > I have log lines repeating as follows:
> > >> >
> > >> > ---
> > >> >  T E S T S
> > >> > ---
> > >> > Running
> > >> >
> > >>
> > org.apache.manifoldcf.agents.output.elasticsearch.tests.APISanityHSQLDBIT
> > >> > Configuration file successfully read
> > >> > [main] INFO org.eclipse.jetty.util.log - Logging initialized @7246ms
> > >> > [main] INFO org.eclipse.jetty.server.Server - jetty-9.2.3.v20140905
> > >> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started
> > >> > o.e.j.w.WebAppContext@2ad48653
> > >> >
> > >> >
> > >>
> >
> {/mcf-crawler-ui,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-5325495261063795321.dir/webapp/,AVAILABLE}{../dependency/mcf-crawler-ui.war}
> > >> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started
> > >> > o.e.j.w.WebAppContext@6bb4dd34
> > >> >
> > >> >
> > >>
> >
> {/mcf-authority-service,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-authority-service-any-1339291969162319913.dir/webapp/,AVAILABLE}{../dependency/mcf-authority-service.war}
> > >> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started
> > >> > o.e.j.w.WebAppContext@7d9f158f
> > >> >
> > >> >
> > >>
> >
> {/mcf-api-service,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any-2172701003493912621.dir/webapp/,AVAILABLE}{../dependency/mcf-api-service.war}
> > >> > [main] INFO org.eclipse.jetty.server.ServerConnector - Started
> > >> > ServerConnector@2796aeae{HTTP/1.1}{0.0.0.0:8346}
> > >> > [main] INFO org.eclipse.jetty.server.Server - Started @10899ms
> > >> > ES working directory is
> > >> >
> > >> >
> > >>
> >
> '/Users/cguzel/Projects/apache/svn/release-2.17-RC1/connectors/elasticsearch/target/test-output/.'
> > >> > Unix process
> > >> > ElasticSearch is starting...
> > >> > Didn't reach ES; waiting...
> > >> > Didn't reach ES; waiting...
> > >> > Didn't reach ES; waiting...
> > >> > Didn't reach ES; waiting...
> > >> > Didn't reach ES; waiting...
> > >> > Didn't reach ES; waiting...

Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
I reproduced the infinite ElasticSearch wait loop when building with Maven.
I'm investigating it now.
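
Independent of the root cause, the wait itself could be bounded so that a failed start ends the test instead of printing "Didn't reach ES; waiting..." forever. A minimal sketch, assuming the readiness check can be expressed as a boolean probe; `BoundedWait` and its parameters are hypothetical and not the existing test code.

```java
import java.util.function.BooleanSupplier;

// Hypothetical bounded polling loop; the probe stands in for the HTTP check against ES.
final class BoundedWait {

  // Calls the probe up to maxAttempts times, sleeping between failed attempts.
  // Returns true as soon as the probe succeeds, false if every attempt fails.
  static boolean waitUntil(BooleanSupplier probe, int maxAttempts, long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (probe.getAsBoolean()) {
        return true;
      }
      System.out.println("Didn't reach ES; waiting... (attempt " + attempt + " of " + maxAttempts + ")");
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Stand-in probe that succeeds on its fourth call.
    boolean up = waitUntil(() -> ++calls[0] >= 4, 10, 100L);
    System.out.println(up ? "ES came up!" : "ES did not start; check es.log");
  }
}
```

A caller would typically fail the test and point at es.log when this returns false.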

On Sat, Sep 12, 2020 at 9:30 AM Michael Cizmar 
wrote:

> I'll take a look at the build this AM.
>
> On Sat, Sep 12, 2020 at 5:13 AM Karl Wright  wrote:
>
>> Hi all,
>>
>> I don't have a Mac, so if we can't figure out how to get ES to start
>> properly on a Mac, I have no ability to debug it myself.  I would be
>> forced
>> to recommend we just disable the test - or continue with the vote, since
>> nothing has changed and we released it this way for the last release in
>> April.  We have people waiting for the Postgresql updates.
>>
>> Karl
>>
>>
>> On Wed, Sep 9, 2020 at 11:54 AM Cihad Guzel  wrote:
>>
>> > Michael,
>> >
>> > I have log lines repeating as follows:
>> >
>> > ---
>> >  T E S T S
>> > ---
>> > Running
>> >
>> org.apache.manifoldcf.agents.output.elasticsearch.tests.APISanityHSQLDBIT
>> > Configuration file successfully read
>> > [main] INFO org.eclipse.jetty.util.log - Logging initialized @7246ms
>> > [main] INFO org.eclipse.jetty.server.Server - jetty-9.2.3.v20140905
>> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started
>> > o.e.j.w.WebAppContext@2ad48653
>> >
>> >
>> {/mcf-crawler-ui,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-5325495261063795321.dir/webapp/,AVAILABLE}{../dependency/mcf-crawler-ui.war}
>> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started
>> > o.e.j.w.WebAppContext@6bb4dd34
>> >
>> >
>> {/mcf-authority-service,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-authority-service-any-1339291969162319913.dir/webapp/,AVAILABLE}{../dependency/mcf-authority-service.war}
>> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started
>> > o.e.j.w.WebAppContext@7d9f158f
>> >
>> >
>> {/mcf-api-service,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any-2172701003493912621.dir/webapp/,AVAILABLE}{../dependency/mcf-api-service.war}
>> > [main] INFO org.eclipse.jetty.server.ServerConnector - Started
>> > ServerConnector@2796aeae{HTTP/1.1}{0.0.0.0:8346}
>> > [main] INFO org.eclipse.jetty.server.Server - Started @10899ms
>> > ES working directory is
>> >
>> >
>> '/Users/cguzel/Projects/apache/svn/release-2.17-RC1/connectors/elasticsearch/target/test-output/.'
>> > Unix process
>> > ElasticSearch is starting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> > Didn't reach ES; waiting...
>> >
>> > Cihad
>> >
>> >
>> > On Wed, Sep 9, 2020 at 12:40, Michael Cizmar wrote:
>> >
>> > > Cihad,
>> > >
>> > > What was the error that you received when compiling in maven?
>> > >
>> > > On Wed, Sep 9, 2020 at 3:39 AM Cihad Guzel  wrote:
>> > >
>> > > > Hi Karl,
>> > > >
>> > > > I have successfully compiled using the ant build. I tried to compile the
>> > > > tag [1] using Maven, but the ElasticSearch tests still fail on Mac [2].
>> > > >
>> > > > [1]
>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.17-RC1
>> > > > [2] https://issues.apache.org/jira/browse/CONNECTORS-1651
>> > > >
>> > > > Regards,
>> > > > Cihad Güzel
>> > > >
>> > > >
>> > > > On Tue, Sep 8, 2020 at 16:32, Karl Wright wrote:
>> > > >
>> > > > > Tests pass.  +1 from me.
>> > > > >
>> > > > > Looking for a few other voters?
>> > > > >
>> > > > > Karl
>> > > > >
>> > > > >
>> > > > > On Sat, Sep 5, 2020 at 6:32 AM Karl Wright 
>> > wrote:
>> > > > >
>> > > > > > Please vote on whether to release Apache ManifoldCF 2.17, RC1.
>> The
>> > > > > > release artifact can be found here:
>> > > > > >
>> > > > > >
>> > > >
>> >
>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.17
>> > > > > >
>> > > > > > There is also a release tag at:
>> > > > > >
>> > > > > >
>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.17-RC1
>> > > > > >
>> > > > > > This release does not contain anything major - just a few bug
>> > fixes,
>> > > > > > summarized in the CHANGES.txt file.  It does include
>> documentation,
>> > > > > > however, which did not get successfully built for the 2.16
>> release.
>> > > > > Please
>> > > > > > review carefully with that in mind.
>> > > > > >
>> > > > > > The respin was required because the ElasticSearch test did not
>> > > properly
>> > > > > > work on the Mac.
>> > > > > >
>> > > > > > Thanks!
>> > > > > > Karl
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>


Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-12 Thread Michael Cizmar
I'll take a look at the build this AM.

On Sat, Sep 12, 2020 at 5:13 AM Karl Wright  wrote:

> Hi all,
>
> I don't have a Mac, so if we can't figure out how to get ES to start
> properly on a Mac, I have no ability to debug it myself.  I would be forced
> to recommend we just disable the test - or continue with the vote, since
> nothing has changed and we released it this way for the last release in
> April.  We have people waiting for the Postgresql updates.
>
> Karl
>
>
> On Wed, Sep 9, 2020 at 11:54 AM Cihad Guzel  wrote:
>
> > Michael,
> >
> > I have log lines repeating as follows:
> >
> > ---
> >  T E S T S
> > ---
> > Running
> > org.apache.manifoldcf.agents.output.elasticsearch.tests.APISanityHSQLDBIT
> > Configuration file successfully read
> > [main] INFO org.eclipse.jetty.util.log - Logging initialized @7246ms
> > [main] INFO org.eclipse.jetty.server.Server - jetty-9.2.3.v20140905
> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started
> > o.e.j.w.WebAppContext@2ad48653
> >
> >
> {/mcf-crawler-ui,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-5325495261063795321.dir/webapp/,AVAILABLE}{../dependency/mcf-crawler-ui.war}
> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started
> > o.e.j.w.WebAppContext@6bb4dd34
> >
> >
> {/mcf-authority-service,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-authority-service-any-1339291969162319913.dir/webapp/,AVAILABLE}{../dependency/mcf-authority-service.war}
> > [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started
> > o.e.j.w.WebAppContext@7d9f158f
> >
> >
> {/mcf-api-service,file:/private/var/folders/gw/4lgs06cd065d09gnm6ythcp8gn/T/jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any-2172701003493912621.dir/webapp/,AVAILABLE}{../dependency/mcf-api-service.war}
> > [main] INFO org.eclipse.jetty.server.ServerConnector - Started
> > ServerConnector@2796aeae{HTTP/1.1}{0.0.0.0:8346}
> > [main] INFO org.eclipse.jetty.server.Server - Started @10899ms
> > ES working directory is
> >
> >
> '/Users/cguzel/Projects/apache/svn/release-2.17-RC1/connectors/elasticsearch/target/test-output/.'
> > Unix process
> > ElasticSearch is starting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> > Didn't reach ES; waiting...
> >
> > Cihad
> >
> >
> > On Wed, Sep 9, 2020 at 12:40, Michael Cizmar wrote:
> >
> > > Cihad,
> > >
> > > What was the error that you received when compiling in maven?
> > >
> > > On Wed, Sep 9, 2020 at 3:39 AM Cihad Guzel  wrote:
> > >
> > > > Hi Karl,
> > > >
> > > > I have successfully compiled using ant build. I tried to compile the
> > > tag[1]
> > > > using maven, but the ElasticSearch tests still fail using Mac [2] .
> > > > <https://issues.apache.org/jira/browse/CONNECTORS-1651>
> > > >
> > > > [1]
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.17-RC1
> > > > [2] https://issues.apache.org/jira/browse/CONNECTORS-1651
> > > >
> > > > Regards,
> > > > Cihad Güzel
> > > >
> > > >
> > > > On Tue, Sep 8, 2020 at 16:32, Karl Wright wrote:
> > > >
> > > > > Tests pass.  +1 from me.
> > > > >
> > > > > Looking for a few other voters?
> > > > >
> > > > > Karl
> > > > >
> > > > >
> > > > > On Sat, Sep 5, 2020 at 6:32 AM Karl Wright 
> > wrote:
> > > > >
> > > > > > Please vote on whether to release Apache ManifoldCF 2.17, RC1.
> The
> > > > > > release artifact can be found here:
> > > > > >
> > > > > >
> > > >
> > https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.17
> > > > > >
> > > > > > There is also a release tag at:
> > > > > >
> > > > > >
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.17-RC1
> > > > > >
> > > > > > This release does not contain anything major - just a few bug
> > fixes,
> > > > > > summarized in the CHANGES.txt file.  It does include
> documentation,
> > > > > > however, which did not get successfully built for the 2.16
> release.
> > > > > Please
> > > > > > review carefully with that in mind.
> > > > > >
> > > > > > The respin was required because the ElasticSearch test did not
> > > properly
> > > > > > work on the Mac.
> > > > > >
> > > > > > Thanks!
> > > > > > Karl
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Release Apache ManifoldCF 2.17, RC1

2020-09-09 Thread Michael Cizmar
Cihad,

What was the error that you received when compiling in maven?

On Wed, Sep 9, 2020 at 3:39 AM Cihad Guzel  wrote:

> Hi Karl,
>
> I have successfully compiled using ant build. I tried to compile the tag[1]
> using maven, but the ElasticSearch tests still fail using Mac [2] .
> 
>
> [1] https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.17-RC1
> [2] https://issues.apache.org/jira/browse/CONNECTORS-1651
>
> Regards,
> Cihad Güzel
>
>
> On Tue, Sep 8, 2020 at 16:32, Karl Wright wrote:
>
> > Tests pass.  +1 from me.
> >
> > Looking for a few other voters?
> >
> > Karl
> >
> >
> > On Sat, Sep 5, 2020 at 6:32 AM Karl Wright  wrote:
> >
> > > Please vote on whether to release Apache ManifoldCF 2.17, RC1.  The
> > > release artifact can be found here:
> > >
> > >
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.17
> > >
> > > There is also a release tag at:
> > >
> > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.17-RC1
> > >
> > > This release does not contain anything major - just a few bug fixes,
> > > summarized in the CHANGES.txt file.  It does include documentation,
> > > however, which did not get successfully built for the 2.16 release.
> > Please
> > > review carefully with that in mind.
> > >
> > > The respin was required because the ElasticSearch test did not properly
> > > work on the Mac.
> > >
> > > Thanks!
> > > Karl
> > >
> > >
> > >
> >
>


[jira] [Commented] (CONNECTORS-1651) ElasticSearch server is not starting during integration test

2020-09-01 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188934#comment-17188934
 ] 

Michael Cizmar commented on CONNECTORS-1651:


I do not particularly like how Elastic is running in the background here but 
the machine learning component is not required for anything that we are doing 
and can be turned off by doing the following:

 

 
{code:java}
// Start Elasticsearch quietly (-q) with X-Pack machine learning disabled.
if (isUnix) {
  pb.command("bash", "-c",
    "../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch -q -Expack.ml.enabled=false");
  System.out.println("Unix process");
} else {
  pb.command("cmd.exe", "/c",
    "..\\test-materials\\windows\\elasticsearch-7.6.2\\bin\\elasticsearch.bat -q -Expack.ml.enabled=false");
  System.out.println("Windows process");
}
{code}
or by copying an elasticsearch.yml file into the config directory with that line.

Either option should turn off machine learning for xpack.
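
A minimal sketch of the command-line option, for illustration only: it builds the launch command as an argument list instead of a single shell string, which avoids shell quoting. The class name EsLauncherSketch and the helper esCommand are hypothetical; the paths, the -q switch and the -Expack.ml.enabled=false setting are taken from the snippet above. Unlike that snippet, the Unix branch here invokes the script directly rather than through bash -c, which assumes the script is executable.

```java
import java.util.ArrayList;
import java.util.List;

public class EsLauncherSketch {
  // Hypothetical helper: returns the command used to start Elasticsearch
  // with the X-Pack machine learning component switched off.
  static List<String> esCommand(boolean isUnix) {
    List<String> cmd = new ArrayList<>();
    if (isUnix) {
      cmd.add("../test-materials/unix/elasticsearch-7.6.2/bin/elasticsearch");
    } else {
      cmd.add("cmd.exe");
      cmd.add("/c");
      cmd.add("..\\test-materials\\windows\\elasticsearch-7.6.2\\bin\\elasticsearch.bat");
    }
    cmd.add("-q");                        // quiet mode, as in the original command
    cmd.add("-Expack.ml.enabled=false");  // disable machine learning
    return cmd;
  }

  public static void main(String[] args) {
    boolean isUnix = !System.getProperty("os.name").toLowerCase().startsWith("windows");
    ProcessBuilder pb = new ProcessBuilder(esCommand(isUnix));
    // Only print the command; calling pb.start() would actually launch the server.
    System.out.println(String.join(" ", pb.command()));
  }
}
```

Keeping the command in one helper means the machine learning switch is stated once for both platforms.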

 

> ElasticSearch server is not starting during integration test
> 
>
> Key: CONNECTORS-1651
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1651
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Elastic Search connector
>Reporter: Piergiorgio Lucidi
>Assignee: Piergiorgio Lucidi
>Priority: Major
>
> Trying to run the test suite on my Mac (with JDK 1.8 but also with JDK 11), 
> ElasticSearch server is not starting properly:
> {noformat}
> [INFO] -< org.apache.manifoldcf:mcf-elasticsearch-connector 
> >--
> [INFO] Building ManifoldCF - Connectors - ElasticSearch 2.17            
> [39/64]
> [INFO] [ jar 
> ]-
> [INFO]
> [INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ 
> mcf-elasticsearch-connector ---
> [INFO]
> [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
> mcf-elasticsearch-connector ---
> [INFO]
> [INFO] --- maven-dependency-plugin:2.8:copy (copy-war) @ 
> mcf-elasticsearch-connector ---
> [INFO] Configured Artifact: org.apache.manifoldcf:mcf-api-service:2.17:war
> [INFO] Configured Artifact: 
> org.apache.manifoldcf:mcf-authority-service:2.17:war
> [INFO] Configured Artifact: org.apache.manifoldcf:mcf-crawler-ui:2.17:war
> [INFO] Copying mcf-api-service-2.17.war to 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/dependency/mcf-api-service.war
> [INFO] Copying mcf-authority-service-2.17.war to 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/dependency/mcf-authority-service.war
> [INFO] Copying mcf-crawler-ui-2.17.war to 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/dependency/mcf-crawler-ui.war
> [INFO]
> [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ 
> mcf-elasticsearch-connector ---
> [debug] execute contextualize
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] Copying 5 resources
> [INFO] Copying 4 resources
> [INFO] Copying 3 resources
> [INFO]
> [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ 
> mcf-elasticsearch-connector ---
> [INFO] Compiling 8 source files to 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes
> [INFO]
> [INFO] --- native2ascii-maven-plugin:1.0-beta-1:native2ascii 
> (native2ascii-utf8) @ mcf-elasticsearch-connector ---
> [INFO] Includes: [**/*.properties]
> [INFO] Excludes: []
> [INFO] Processing 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_en_US.properties
> [INFO] Processing 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_ja_JP.properties
> [INFO] Processing 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_zh_CN.properties
> [INFO] Processing 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_fr_FR.properties
> [INFO] Processing 
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_es_ES.properties
> [INFO]
> [INFO] --- maven-resource

Re: [VOTE] Release Apache ManifoldCF 2.17, RC0

2020-08-30 Thread Michael Cizmar
From what I see, it does not appear that the tests are failing.  The
Elasticsearch container is not starting.  I agree with Karl.
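
To make the difference between "tests failing" and "server never started" concrete, here is a small sketch of a bounded readiness wait. It does not describe how the ManifoldCF test harness is written; the class WaitForServiceSketch, the method waitUntilReachable and the stand-in probe are all hypothetical.

```java
import java.util.function.BooleanSupplier;

public class WaitForServiceSketch {
  // Polls the probe until it succeeds or maxAttempts is used up.
  // Returns true if the service became reachable, false otherwise.
  static boolean waitUntilReachable(BooleanSupplier probe, int maxAttempts, long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (probe.getAsBoolean()) {
        return true;
      }
      System.out.println("Didn't reach ES; waiting... (attempt " + attempt + " of " + maxAttempts + ")");
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Stand-in probe that succeeds on the third call; a real probe would issue an HTTP request.
    int[] calls = {0};
    boolean up = waitUntilReachable(() -> ++calls[0] >= 3, 5, 10L);
    System.out.println(up ? "ES reachable" : "ES never started; report a setup failure");
  }
}
```

With a bounded wait, a server that never comes up can be reported as a setup problem rather than as a test failure.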

On Sat, Aug 29, 2020 at 9:24 AM Karl Wright  wrote:

> Hmm, the way it starts the process is the same on Windows and Linux.  The
> version of ES we download for the test is the Linux distribution, so I am
> surprised that it's not actually working on Linux.  Maybe the environment
> variables are incorrect for that?
>
> Since it passes on Windows, I think this should not be a blocker.  But we
> should open a ticket and investigate the issue.  Would you like to create
> that ticket?
>
> Karl
>
>
> On Sat, Aug 29, 2020 at 10:08 AM Piergiorgio Lucidi <
> piergior...@apache.org>
> wrote:
>
> > Hi,
> >
> > -1 from me, it seems that the ElasticSearch tests are failing using Mac
> on
> > both JDK 8 and JDK 11:
> >
> > [INFO] -< org.apache.manifoldcf:mcf-elasticsearch-connector
> > > >--
> > > [INFO] Building ManifoldCF - Connectors - ElasticSearch 2.17
> > >  [39/64]
> > > [INFO] [ jar
> > > ]-
> > > [INFO]
> > > [INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @
> > > mcf-elasticsearch-connector ---
> > > [INFO]
> > > [INFO] --- maven-remote-resources-plugin:1.5:process (default) @
> > > mcf-elasticsearch-connector ---
> > > [INFO]
> > > [INFO] --- maven-dependency-plugin:2.8:copy (copy-war) @
> > > mcf-elasticsearch-connector ---
> > > [INFO] Configured Artifact:
> > org.apache.manifoldcf:mcf-api-service:2.17:war
> > > [INFO] Configured Artifact:
> > > org.apache.manifoldcf:mcf-authority-service:2.17:war
> > > [INFO] Configured Artifact:
> org.apache.manifoldcf:mcf-crawler-ui:2.17:war
> > > [INFO] Copying mcf-api-service-2.17.war to
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/dependency/mcf-api-service.war
> > > [INFO] Copying mcf-authority-service-2.17.war to
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/dependency/mcf-authority-service.war
> > > [INFO] Copying mcf-crawler-ui-2.17.war to
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/dependency/mcf-crawler-ui.war
> > > [INFO]
> > > [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @
> > > mcf-elasticsearch-connector ---
> > > [debug] execute contextualize
> > > [INFO] Using 'UTF-8' encoding to copy filtered resources.
> > > [INFO] Copying 5 resources
> > > [INFO] Copying 4 resources
> > > [INFO] Copying 3 resources
> > > [INFO]
> > > [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @
> > > mcf-elasticsearch-connector ---
> > > [INFO] Compiling 8 source files to
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes
> > > [INFO]
> > > [INFO] --- native2ascii-maven-plugin:1.0-beta-1:native2ascii
> > > (native2ascii-utf8) @ mcf-elasticsearch-connector ---
> > > [INFO] Includes: [**/*.properties]
> > > [INFO] Excludes: []
> > > [INFO] Processing
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_en_US.properties
> > > [INFO] Processing
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_ja_JP.properties
> > > [INFO] Processing
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_zh_CN.properties
> > > [INFO] Processing
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_fr_FR.properties
> > > [INFO] Processing
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/classes/org/apache/manifoldcf/agents/output/elasticsearch/common_es_ES.properties
> > > [INFO]
> > > [INFO] --- maven-resources-plugin:2.5:testResources
> > > (default-testResources) @ mcf-elasticsearch-connector ---
> > > [debug] execute contextualize
> > > [INFO] Using 'UTF-8' encoding to copy filtered resources.
> > > [INFO] skip non existing resourceDirectory
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/connector/src/test/resources
> > > [INFO] Copying 3 resources
> > > [INFO]
> > > [INFO] --- maven-compiler-plugin:2.3.2:testCompile
> (default-testCompile)
> > @
> > > mcf-elasticsearch-connector ---
> > > [INFO] Compiling 6 source files to
> > >
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.17/connectors/elasticsearch/target/test-classes
> > > [INFO]
> > > [INFO] --- maven-surefire-plugin:2.17:test (default-test) @
> > > mcf-elasticsearch-connector ---
> > > [INFO]
> > 

Re: Document Splitter

2020-07-08 Thread Michael Cizmar
Cool.  I'll shift to that approach.  We have a lot of cases where we are indexing a 
CSV, XML, or JSON file that we want split up.


--

Michael Cizmar
Managing Director

p: 312.585.6396

d: 312.585.6286
twitter: @michaelcizmar<http://twitter.com/michaelcizmar>

http://www.mcplusa.com/




From: Karl Wright 
Sent: Wednesday, July 8, 2020 4:43 PM
To: dev 
Subject: Re: Document Splitter

Hi all,
Julien is correct; all documents must originate in the document
repository.  You can create document components this way, but they're all
subsidiaries of the principal document, so really the framework only tracks
the principal document in that case.

So you have a choice: either use the component approach, or have each row
be a full document in its own right.

From what I see, the component approach would be the best one.

Karl
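
As a rough sketch of the splitting step only, the code below cuts the text of a top-level JSON array into the raw text of each element, so that each piece could be handled as one component of the principal document. It assumes the file is an array of objects, uses no JSON library, and shows no ManifoldCF API calls; the class name JsonArraySplitSketch, the method splitTopLevel and the "row-N" component identifiers are hypothetical. A real connector would more likely use a proper JSON parser.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonArraySplitSketch {
  // Returns the raw text of each object (or nested array) directly inside
  // a top-level JSON array. String literals and escapes are tracked so that
  // brackets inside strings are ignored.
  static List<String> splitTopLevel(String json) {
    List<String> parts = new ArrayList<>();
    int depth = 0;
    boolean inString = false;
    boolean escaped = false;
    int start = -1;
    for (int i = 0; i < json.length(); i++) {
      char c = json.charAt(i);
      if (inString) {
        if (escaped) {
          escaped = false;
        } else if (c == '\\') {
          escaped = true;
        } else if (c == '"') {
          inString = false;
        }
        continue;
      }
      if (c == '"') {
        inString = true;
      } else if (c == '{' || c == '[') {
        depth++;
        if (depth == 2 && start < 0) {
          start = i;  // an element of the outer array begins here
        }
      } else if (c == '}' || c == ']') {
        if (depth == 2 && start >= 0) {
          parts.add(json.substring(start, i + 1));
          start = -1;
        }
        depth--;
      }
    }
    return parts;
  }

  public static void main(String[] args) {
    String json = "[{\"id\":1,\"note\":\"a } b\"},{\"id\":2,\"tags\":[\"x\",\"y\"]}]";
    List<String> rows = splitTopLevel(json);
    for (int i = 0; i < rows.size(); i++) {
      // "row-N" is an illustrative component identifier, not a ManifoldCF convention.
      System.out.println("row-" + i + ": " + rows.get(i));
    }
  }
}
```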


On Wed, Jul 8, 2020 at 1:25 PM Michael Cizmar 
wrote:

> Good point, I was thinking that I could do a:
> return activities.sendDocument(documentURI,docCopy);
>
> For each row of the XML or JSON.
>
>
>
> 
> From: julien.massi...@francelabs.com 
> Sent: Wednesday, July 8, 2020 9:45 AM
> To: dev@manifoldcf.apache.org 
> Subject: RE: Document Splitter
>
> Hi Michael,
>
> if I am not wrong (and that Karl confirms), what you want to do is not
> possible in a transformation connector. A transformation connector cannot
> transform 1 incoming document into several ones. The only way to do that is
> in a repository connector but it would then be bound to the type of the
> repo source.
>
> Regards,
> Julien
>
> -----Original Message-----
> From: Karl Wright
> Sent: Wednesday, July 8, 2020 16:16
> To: dev
> Subject: Re: Document Splitter
>
> Not that I know of.  But I'll let others answer as to what they may have
> written.
> Karl
>
>
> On Tue, Jul 7, 2020 at 7:38 PM Michael Cizmar 
> wrote:
>
> > I have a Json file which has an array of objects that I want to index
> > as separate documents.  Before I build a transformer to split it, is
> > there a ready made transformer to do this?
> >
> > Thanks!
> >
> > Michael
> >
>
>


Re: Document Splitter

2020-07-08 Thread Michael Cizmar
Good point, I was thinking that I could do a:
return activities.sendDocument(documentURI,docCopy);

For each row of the XML or JSON.




From: julien.massi...@francelabs.com 
Sent: Wednesday, July 8, 2020 9:45 AM
To: dev@manifoldcf.apache.org 
Subject: RE: Document Splitter

Hi Michael,

if I am not wrong (and that Karl confirms), what you want to do is not possible 
in a transformation connector. A transformation connector cannot transform 1 
incoming document into several ones. The only way to do that is in a repository 
connector but it would then be bound to the type of the repo source.

Regards,
Julien

-----Original Message-----
From: Karl Wright
Sent: Wednesday, July 8, 2020 16:16
To: dev
Subject: Re: Document Splitter

Not that I know of.  But I'll let others answer as to what they may have 
written.
Karl


On Tue, Jul 7, 2020 at 7:38 PM Michael Cizmar 
wrote:

> I have a Json file which has an array of objects that I want to index
> as separate documents.  Before I build a transformer to split it, is
> there a ready made transformer to do this?
>
> Thanks!
>
> Michael
>



[jira] [Commented] (CONNECTORS-1648) PostgreSQL 10,11 and 12 support

2020-07-07 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153142#comment-17153142
 ] 

Michael Cizmar commented on CONNECTORS-1648:


No.  I can give that a go in the next week or so and report back.

> PostgreSQL 10,11 and 12 support
> ---
>
> Key: CONNECTORS-1648
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1648
> Project: ManifoldCF
>  Issue Type: Improvement
>Affects Versions: ManifoldCF 2.11, ManifoldCF 2.12, ManifoldCF 2.13, 
> ManifoldCF 2.14, ManifoldCF 2.15, ManifoldCF 2.16
>Reporter: DK
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.17
>
>
> As per the current documentation, "ManifoldCF has been tested against version 
> 8.3.7, 8.4.5 and 9.1 of PostgreSQL. "
> 9.1 release date is Sep 12 , 2011 and EOL on Oct 27 2016 already.
> No support for newer versions.
> This is important to deploy ManifoldCF in an enterprise environment where 
> value added service such as HA, Backup and Recovery and Monitoring etc are 
> provided by third party vendors. These vendors do not support postgreSql 
> versions which are reaching end of life.
> Any plan to test and certify recent versions such as 10,11 and 12.3?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Document Splitter

2020-07-07 Thread Michael Cizmar
I have a Json file which has an array of objects that I want to index as
separate documents.  Before I build a transformer to split it, is there a
ready made transformer to do this?

Thanks!

Michael


[jira] [Commented] (CONNECTORS-1648) PostgreSQL 10,11 and 12 support

2020-07-06 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152340#comment-17152340
 ] 

Michael Cizmar commented on CONNECTORS-1648:


It works with new versions.  I think we've run it on 10 and 11 recently.  I 
would like this updated as well.

> PostgreSQL 10,11 and 12 support
> ---
>
> Key: CONNECTORS-1648
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1648
> Project: ManifoldCF
>  Issue Type: Improvement
>Affects Versions: ManifoldCF 2.11, ManifoldCF 2.12, ManifoldCF 2.13, 
> ManifoldCF 2.14, ManifoldCF 2.15, ManifoldCF 2.16
>Reporter: DK
>Priority: Major
>
> As per the current documentation, "ManifoldCF has been tested against version 
> 8.3.7, 8.4.5 and 9.1 of PostgreSQL. "
> 9.1 release date is Sep 12 , 2011 and EOL on Oct 27 2016 already.
> No support for newer versions.
> This is important to deploy ManifoldCF in an enterprise environment where 
> value added service such as HA, Backup and Recovery and Monitoring etc are 
> provided by third party vendors. These vendors do not support postgreSql 
> versions which are reaching end of life.
> Any plan to test and certify recent versions such as 10,11 and 12.3?
>  





[jira] [Commented] (CONNECTORS-1642) PostgreSQL Version >= 12.2 DB Initialization Problems

2020-05-12 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105371#comment-17105371
 ] 

Michael Cizmar commented on CONNECTORS-1642:


Is this sort of a latest-version issue?  We got ManifoldCF working with 10 and 
maybe 11.

> PostgreSQL Version >= 12.2 DB Initialization Problems
> -
>
> Key: CONNECTORS-1642
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1642
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Framework core
>Affects Versions: ManifoldCF 2.15
>Reporter: Uwe Wolfinger
>Assignee: Karl Wright
>Priority: Major
>
> when trying to run the "./executecommand.sh 
> org.apache.manifoldcf.crawler.InitializeAndRegister" script, the following 
> error shows up and the initialization process stops:
> {{ WARNING: Illegal reflective access by org.postgresql.jdbc.TimestampUtils 
> ([file:/home/suche/crawler/lib/postgresql-42.1.3.jar|file:///home/suche/crawler/lib/postgresql-42.1.3.jar])
>  to field java.util.TimeZone.defaultTimeZone}}
> {{ WARNING: Please consider reporting this to the maintainers of 
> org.postgresql.jdbc.TimestampUtils}}
> {{ WARNING: Use --illegal-access=warn to enable warnings of further illegal 
> reflective access operations}}
> {{ WARNING: All illegal access operations will be denied in a future release}}
> {{ org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database 
> exception: SQLException doing query (42703): FEHLER: Spalte pg_attrdef.adsrc 
> existiert nicht}}
> {{ Position: 447}}
> {{ at 
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.finishUp(Database.java:715)}}
> {{ at 
> org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:741)}}
> {{ at 
> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:803)}}
> {{ at 
> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)}}
> {{ at 
> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)}}
> {{ at 
> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)}}
> {{ at 
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)}}
> {{ at 
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.getTableSchema(DBInterfacePostgreSQL.java:696)}}
> {{ at 
> org.apache.manifoldcf.core.database.BaseTable.getTableSchema(BaseTable.java:185)}}
> {{ at 
> org.apache.manifoldcf.agents.agentmanager.AgentManager.install(AgentManager.java:67)}}
> {{ at 
> org.apache.manifoldcf.agents.system.ManifoldCF.installTables(ManifoldCF.java:112)}}
>  
> the column "pg_attrdef.adsrc" no longer exists in PostgreSQL DB 12.2.
> [https://www.postgresql.org/docs/11/catalog-pg-attrdef.html]
> which means that it is impossible to initialize the core DB in a PostgreSQL  
> 12.2
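
For context, PostgreSQL 12 removed pg_attrdef.adsrc, and the documented replacement is pg_get_expr(adbin, adrelid), which also works on older servers. The sketch below shows what a catalog query using it could look like. It is not the query ManifoldCF actually runs; the class ColumnDefaultQuerySketch, the method columnDefaultsQuery and the selected columns are hypothetical.

```java
public class ColumnDefaultQuerySketch {
  // Hypothetical catalog query listing the columns of one table with their
  // default expressions. pg_get_expr(adbin, adrelid) stands in for the
  // removed pg_attrdef.adsrc column.
  static String columnDefaultsQuery() {
    return "SELECT a.attname, pg_get_expr(d.adbin, d.adrelid) AS default_expr "
        + "FROM pg_attribute a "
        + "LEFT JOIN pg_attrdef d ON d.adrelid = a.attrelid AND d.adnum = a.attnum "
        + "WHERE a.attrelid = ?::regclass AND a.attnum > 0 AND NOT a.attisdropped";
  }

  public static void main(String[] args) {
    // The single parameter is the table name; it would be bound through a JDBC PreparedStatement.
    System.out.println(columnDefaultsQuery());
  }
}
```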





Re: MongoDB download now no longer works

2020-05-05 Thread Michael Cizmar
Looks like that is using version 2.1.2-SNAPSHOT.  Key part I think is 
"Snapshot".  I did a search and found this one (and there is a 2.1.2 version):

https://search.maven.org/artifact/de.flapdoodle.embed/de.flapdoodle.embed.mongo/2.2.0/jar

and it appears to be available for download still:

https://repo1.maven.org/maven2/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.2.0/
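
A small sketch of how the release URL is put together, assuming the usual Maven repository layout (group ID as a path, then artifact ID, version, and artifactId-version.jar). The class MavenCentralUrlSketch and the method releaseJarUrl are hypothetical, and the resulting jar URL itself has not been checked here; only the directory above was.

```java
public class MavenCentralUrlSketch {
  // Builds the URL of a release jar on Maven Central from its coordinates,
  // following the standard repository layout.
  static String releaseJarUrl(String groupId, String artifactId, String version) {
    return "https://repo1.maven.org/maven2/"
        + groupId.replace('.', '/') + "/"
        + artifactId + "/"
        + version + "/"
        + artifactId + "-" + version + ".jar";
  }

  public static void main(String[] args) {
    System.out.println(releaseJarUrl("de.flapdoodle.embed", "de.flapdoodle.embed.mongo", "2.2.0"));
  }
}
```

Pointing the download at a fixed release rather than a timestamped snapshot build should make it less likely to disappear later.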

-- 
Michael Cizmar


On 5/5/20, 8:55 AM, "Karl Wright"  wrote:

A unit test for the MongoDB connector needs to download a version of
MongoDB to do this testing.  Unfortunately it looks like MongoDB has
removed their old, free versions from the download repository.  We get this
now:

C:\wip\mcf\trunk\connectors\mongodb>ant download-dependencies
Buildfile: C:\wip\mcf\trunk\connectors\mongodb\build.xml

download-dependencies:
  [get] Getting:

https://oss.sonatype.org/content/repositories/snapshots/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.1.2-SNAPSHOT/de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
  [get] To:

C:\wip\mcf\trunk\connectors\mongodb\test-materials\de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
  [get] Error opening connection java.io.FileNotFoundException:

https://oss.sonatype.org/content/repositories/snapshots/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.1.2-SNAPSHOT/de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
  [get] Error opening connection java.io.FileNotFoundException:

https://oss.sonatype.org/content/repositories/snapshots/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.1.2-SNAPSHOT/de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
  [get] Error opening connection java.io.FileNotFoundException:

https://oss.sonatype.org/content/repositories/snapshots/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.1.2-SNAPSHOT/de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
  [get] Can't get

https://oss.sonatype.org/content/repositories/snapshots/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.1.2-SNAPSHOT/de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
to

C:\wip\mcf\trunk\connectors\mongodb\test-materials\de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar

BUILD FAILED
C:\wip\mcf\trunk\connectors\mongodb\build.xml:92: Can't get

https://oss.sonatype.org/content/repositories/snapshots/de/flapdoodle/embed/de.flapdoodle.embed.mongo/2.1.2-SNAPSHOT/de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar
to

C:\wip\mcf\trunk\connectors\mongodb\test-materials\de.flapdoodle.embed.mongo-2.1.2-20180621.063700-1.jar

Total time: 4 seconds

C:\wip\mcf\trunk\connectors\mongodb>

Any ideas?  Or do we just need to disable this test too?

Karl



Re: [VOTE] Release Apache ManifoldCF 2.16, RC0

2020-05-03 Thread Michael Cizmar
Great work Karl!  I'm looking forward to trying this out.

On Sun, May 3, 2020 at 1:31 PM Karl Wright  wrote:

> Please vote on whether to release Apache ManifoldCF 2.16, RC0.  This
> release has a new confluence connector as well as preliminary support for
> Java 11.  The release artifact can be found at:
>
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.16
>
> There is a release tag at:
>
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.16-RC0
>
> Thanks in advance!
> Karl
>


[jira] [Commented] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-02 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097963#comment-17097963
 ] 

Michael Cizmar commented on CONNECTORS-1639:


[~kwri...@metacarta.com] I modified the script slightly so that it does not need 
sudo and to work around an issue I encountered on macOS with X-Pack: I disabled 
Machine Learning in X-Pack.  The script downloads both Elasticsearch (version 7.6.2) 
and the ingest-attachment plugin.  The mapper-attachments plugin was deprecated in 
version 6.  The unit tests will work without the mapper-attachments plugin.

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log, es_start.sh, es_stop.sh
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





[jira] [Updated] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-02 Thread Michael Cizmar (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Cizmar updated CONNECTORS-1639:
---
Attachment: es_start.sh

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log, es_start.sh, es_stop.sh
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





[jira] [Updated] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-02 Thread Michael Cizmar (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Cizmar updated CONNECTORS-1639:
---
Attachment: (was: es_start.sh)

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log, es_start.sh, es_stop.sh
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





[jira] [Updated] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-02 Thread Michael Cizmar (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Cizmar updated CONNECTORS-1639:
---
Attachment: es_start.sh

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log, es_start.sh, es_stop.sh
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





[jira] [Updated] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-02 Thread Michael Cizmar (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Cizmar updated CONNECTORS-1639:
---
Attachment: es_stop.sh

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log, es_start.sh, es_stop.sh
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





[jira] [Commented] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-01 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097747#comment-17097747
 ] 

Michael Cizmar commented on CONNECTORS-1639:


This should spin up an instance that responds on localhost:9200; with that 
running, the Java code changes I made should work.  What else is missing?  I 
can take a look tomorrow morning.

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





Re: Should we continue to hold the 4/30 release until the ES testing is ironed out?

2020-05-01 Thread Michael Cizmar
Karl,

We created the script for you today and added it to:
https://issues.apache.org/jira/browse/CONNECTORS-1639

I had them make one to start it up and one to shut it down.  Please let me know 
if this works for you.

-- 
Michael Cizmar


On 5/1/20, 6:02 AM, "Karl Wright"  wrote:

I've got time this weekend to make code changes, but I don't actually know
how to proceed, so I think we're stuck.  What I need to have is
instructions on how to set up a modern ES release with the mapper
attachment or equivalent.  I currently download the latest ES release but
find that the mapper attachment available from the Maven repo is
incompatible with this version, and the new ES-supported mapper is not
available until ES 8.0.  Help?!?

If nobody knows how to resolve this right now, I would still release, and
delay the work for updating ES to next release.  Thoughts?

Karl



[jira] [Commented] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-05-01 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097658#comment-17097658
 ] 

Michael Cizmar commented on CONNECTORS-1639:


[^es_start.sh]

[^es_stop.sh]

Attached are the scripts that perform the startup and shutdown.  Credit to 
Gustavo Llermaly @ MC+A.

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





Re: Should we continue to hold the 4/30 release until the ES testing is ironed out?

2020-05-01 Thread Michael Cizmar
Karl, we'll get you this script today.  I've been slammed with end-of-month 
work.  Sorry for the delay.



On 5/1/20, 6:02 AM, "Karl Wright"  wrote:

I've got time this weekend to make code changes, but I don't actually know
how to proceed, so I think we're stuck.  What I need to have is
instructions on how to set up a modern ES release with the mapper
attachment or equivalent.  I currently download the latest ES release but
find that the mapper attachment available from the Maven repo is
incompatible with this version, and the new ES-supported mapper is not
available until ES 8.0.  Help?!?

If nobody knows how to resolve this right now, I would still release, and
delay the work for updating ES to next release.  Thoughts?

Karl



Re: Status of Elastic Search integration tests

2020-04-29 Thread Michael Cizmar
Right.  Refactoring this to use the Elastic Java SDK is on my radar.  If
we use that then, in my view, the encoding of the document would be the
responsibility of the SDK and one less thing to test.  (The Java SDK is
somewhat complicated as well, because they tend to rewrite the underlying
transmission pieces.)

For testing purposes, we can currently install the mapper attachment and
create an ingestion workflow to handle that.
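
As a rough illustration of what such a test would send (a sketch under
assumptions, not the connector's actual code): the binary content is
Base64-encoded into a JSON field and indexed with the "pipeline" request
parameter, so that the ingest side does the text extraction.  The class
name, the field name "data" and the pipeline id "attachment" are all
hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: shapes an index request for an attachment pipeline. */
final class AttachmentDocument {

    /** JSON body carrying the binary content as Base64 in the given field. */
    static String body(String field, byte[] content) {
        String encoded = Base64.getEncoder().encodeToString(content);
        return "{\"" + field + "\":\"" + encoded + "\"}";
    }

    /** Request path that routes the document through the named ingest pipeline. */
    static String path(String index, String id, String pipelineId) {
        return "/" + index + "/_doc/" + id + "?pipeline=" + pipelineId;
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println("PUT " + path("docs", "1", "attachment"));
        System.out.println(body("data", content));
    }
}
```

The field and id values here are assumed to need no JSON or URL escaping;
real document ids would.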

On Wed, Apr 29, 2020 at 6:44 AM Karl Wright  wrote:

> The connector itself encodes binary documents and sends them to ES,
> purportedly for the mapper attachment to process and convert to text.  The
> test used to exercise that.
>
> Perhaps it's worth reviewing the connector code itself to see what is
> outdated/legacy, and only test the parts that are not outdated?
>
> Specifically, my concern is that we need to support binary document
> transmission to ES, and ES obviously needs to handle those for the
> integration to work properly.
>
> Karl
>
>
> On Wed, Apr 29, 2020 at 7:27 AM Michael Cizmar 
> wrote:
>
> > There's been some changes to Elasticsearch like reducing the document
> types
> > and ingestion/mapper.  The mapper attachment I believe has been
> > deprecated in favor of:
> >
> >
> >
> https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-attachment.html
> >
> > This should be incorporated into a pipeline.  Do we need something like
> > this in our integration test?  I don't think it's the responsibility of
> the
> > output connector to handle this.
> >
> > On Wed, Apr 29, 2020 at 5:25 AM Karl Wright  wrote:
> >
> > > Hello all,
> > >
> > > I set up a branch (branches/CONNECTORS-1639) to work on the
> elasticsearch
> > > test problem for JDK 11.  The branch downloads an ES and a mapper
> > > attachment but it turns out that the mapper attachment is apparently
> > > incompatible with the current (7.x) version of ES.  Does anyone know
> > > whether the mapper attachment is still supported?  If so, where can I
> > find
> > > it in the Maven repo?
> > >
> > > Karl
> > >
> >
>


Re: Status of Elastic Search integration tests

2020-04-29 Thread Michael Cizmar
There have been some changes to Elasticsearch, such as reducing the document
types and reworking ingestion/mapper.  I believe the mapper attachment has
been deprecated in favor of:

https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-attachment.html

This should be incorporated into a pipeline.  Do we need something like
this in our integration test?  I don't think it's the responsibility of the
output connector to handle this.
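
For reference, a minimal sketch of registering such a pipeline from a test
setup step.  It assumes the ingest-attachment plugin is installed on the
node and that the node runs without security; the class name, pipeline id
and source field are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical test helper: defines an ingest pipeline with one attachment processor. */
final class AttachmentPipeline {

    /** Pipeline definition reading Base64 content from sourceField. */
    static String definition(String sourceField) {
        return "{\"description\":\"Extract text from binary attachments\","
             + "\"processors\":[{\"attachment\":{\"field\":\"" + sourceField + "\"}}]}";
    }

    /** Sends PUT {baseUrl}/_ingest/pipeline/{pipelineId}; returns the HTTP status. */
    static int register(String baseUrl, String pipelineId, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
            URI.create(baseUrl + "/_ingest/pipeline/" + pipelineId).toURL().openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A test would call something like register("http://localhost:9200",
"attachment", definition("data")) once before indexing.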

On Wed, Apr 29, 2020 at 5:25 AM Karl Wright  wrote:

> Hello all,
>
> I set up a branch (branches/CONNECTORS-1639) to work on the elasticsearch
> test problem for JDK 11.  The branch downloads an ES and a mapper
> attachment but it turns out that the mapper attachment is apparently
> incompatible with the current (7.x) version of ES.  Does anyone know
> whether the mapper attachment is still supported?  If so, where can I find
> it in the Maven repo?
>
> Karl
>


Re: Release schedule

2020-04-23 Thread Michael Cizmar
I can try this one out:
https://david.pilato.fr/blog/2016/10/18/elasticsearch-real-integration-tests-updated-for-ga/

Based on a review, this would seem to accomplish that.  As far as I know
there is no remote shutdown.  You can download, unpack and then start Elastic
with the necessary commands from the command line.  You then wait until it
responds on 9200 and write out the PID.  Terminating it means killing the PID.
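
The "wait until it responds on 9200" step could look roughly like this
from inside a Java test.  This is only a sketch: the class name, timeouts
and polling interval are assumptions, and it treats an HTTP 200 on the root
URL as "up", which holds for a node without security enabled.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical helper: polls an HTTP endpoint until it answers 200 or a deadline passes. */
final class EsReadinessProbe {

    /** True if a single GET to the URL answers HTTP 200 within one second. */
    static boolean isUp(String url) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(1000);
            conn.setReadTimeout(1000);
            conn.setRequestMethod("GET");
            return conn.getResponseCode() == 200;
        } catch (IOException e) {
            return false; // nothing listening yet, or not reachable
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    /** Polls until the endpoint is up; returns false if the timeout elapses first. */
    static boolean waitUntilUp(String url, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (isUp(url)) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // give up if the test thread is interrupted
            }
        }
    }
}
```

A test setup would call waitUntilUp("http://localhost:9200/", 60000, 500)
after launching the node and fail fast if it returns false.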

On Thu, Apr 23, 2020 at 6:55 AM Karl Wright  wrote:

> The problem with running anything under Ant is that it's not set up for
> this kind of flow:
>
> - start service
> - run tests
> - stop service
>
> Ant is about building and is not a sequential language, so starting this
> under Ant is the wrong idea.
>
> Instead, we can invoke scripts at will from within the java test class
> itself.  But we need both Linux and Windows scripts for that, and we
> download only one ES instance, and therefore we get only one kind of script
> to go with it.
>
> I supposed I can platform-conditionalize the download itself so we get
> different scripts for different platforms.  Since ES doesn't include all
> script variants in every download we're kind of stuck with this it seems.
>
> The other issue we have to address is waiting for ES to actually fully
> start.  I believe the code we had did this via a specific HTTP Get
> request fired at the instance, so maybe we can reuse that code.  There may
> also be a way to shut ES down via a similar HTTP Get mechanism.
>
> Can you verify that waiting for ES to come up and shutting down ES can be
> done with the same mechanism as is currently in the test code?
>
> Karl
>
>
> On Thu, Apr 23, 2020 at 7:44 AM Michael Cizmar 
> wrote:
>
> > Karl,
> >
> > I found this:
> >
> >
> https://gquintana.github.io/2016/11/30/Testing-a-Java-and-Elasticsearch-50-application.html
> >
> > Which should solve the issue of running elastic in the background via
> Ant.
> >
> > I can also provide a simple setup script (sh) if that helps as well.
> >
> > Michael
> >
> > On Wed, Apr 22, 2020 at 12:35 PM Karl Wright  wrote:
> >
> > > I looked into trying to get things working under Ant and created a
> branch
> > > CONNECTORS-1639 containing some changes relating to download of
> > > elasticsearch artifacts.  I did a little exploration as to whether we
> > could
> > > use the Elasticsearch Runner package to start a cluster, but that is
> > really
> > > painful because it has a ton of dependencies, so I think I'll just try
> > > calling the main class that the ES startup script uses and see how we
> do
> > > that way.
> > >
> > > But I'm snowed under with work related tasks again so it will have to
> > wait.
> > >
> > > Karl
> > >
> > >
> > > On Sat, Apr 18, 2020 at 5:52 PM Michael Cizmar <
> > mich...@michaelcizmar.com>
> > > wrote:
> > >
> > > > I've updated the ticket with the changes:
> > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/projects/CONNECTORS/issues/CONNECTORS-1639
> > > >
> > > > On Sat, Apr 18, 2020 at 3:59 PM Cihad Guzel 
> wrote:
> > > >
> > > > > Thanks folks for your information.
> > > > >
> > > > > I reviewed the pom.xml of the  ES connector. It uses an old elastic
> > > > search
> > > > > version as dependency. It misled me. On the other hand, you are
> > right.
> > > I
> > > > > agree with your thoughts on this matter. It is best if we can
> > rearrange
> > > > > them.
> > > > >
> > > > > Kind regards,
> > > > > Cihad Guzel
> > > > >
> > > > >
> > > > > On Sat, Apr 18, 2020 at 17:49, Karl Wright wrote:
> > > > >
> > > > > > I can help with Ant tasks but I need information as to how you're
> > > > > supposed
> > > > > > to start the ES instance.  An ant task snippet would be
> sufficient
> > I
> > > > > think.
> > > > > >
> > > > > >
> > > > > > On Sat, Apr 18, 2020 at 10:33 AM Michael Cizmar <
> > > > > > michael.ciz...@mcplusa.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I believe so.  I only modified the Pom in the es connector
> > project
> > > > and
> > > > > > > removed th

Re: Release schedule

2020-04-23 Thread Michael Cizmar
Karl,

I found this:
https://gquintana.github.io/2016/11/30/Testing-a-Java-and-Elasticsearch-50-application.html

It should solve the issue of running Elasticsearch in the background via Ant.

I can also provide a simple setup script (sh) if that helps.

Michael

On Wed, Apr 22, 2020 at 12:35 PM Karl Wright  wrote:

> I looked into trying to get things working under Ant and created a branch
> CONNECTORS-1639 containing some changes relating to download of
> elasticsearch artifacts.  I did a little exploration as to whether we could
> use the Elasticsearch Runner package to start a cluster, but that is really
> painful because it has a ton of dependencies, so I think I'll just try
> calling the main class that the ES startup script uses and see how we do
> that way.
>
> But I'm snowed under with work related tasks again so it will have to wait.
>
> Karl
>
>
> On Sat, Apr 18, 2020 at 5:52 PM Michael Cizmar 
> wrote:
>
> > I've updated the ticket with the changes:
> >
> >
> https://issues.apache.org/jira/projects/CONNECTORS/issues/CONNECTORS-1639
> >
> > On Sat, Apr 18, 2020 at 3:59 PM Cihad Guzel  wrote:
> >
> > > Thanks folks for your information.
> > >
> > > I reviewed the pom.xml of the  ES connector. It uses an old elastic
> > search
> > > version as dependency. It misled me. On the other hand, you are right.
> I
> > > agree with your thoughts on this matter. It is best if we can rearrange
> > > them.
> > >
> > > Kind regards,
> > > Cihad Guzel
> > >
> > >
> > > On Sat, Apr 18, 2020 at 17:49, Karl Wright wrote:
> > >
> > > > I can help with Ant tasks but I need information as to how you're
> > > supposed
> > > > to start the ES instance.  An ant task snippet would be sufficient I
> > > think.
> > > >
> > > >
> > > > On Sat, Apr 18, 2020 at 10:33 AM Michael Cizmar <
> > > > michael.ciz...@mcplusa.com>
> > > > wrote:
> > > >
> > > > > I believe so.  I only modified the Pom in the es connector project
> > and
> > > > > removed the Node references.  I know there is a way to do this in
> ant
> > > as
> > > > > well.  I will look it up but may need some guidance on Ant.
> > > > >
> > > > > 
> > > > > From: Karl Wright 
> > > > > Sent: Saturday, April 18, 2020 9:15:15 AM
> > > > > To: dev 
> > > > > Subject: Re: Release schedule
> > > > >
> > > > > Hi Michael,
> > > > > This has to run under Ant as well.  Any way to make that happen?
> > > > >
> > > > > Karl
> > > > >
> > > > >
> > > > > On Sat, Apr 18, 2020 at 9:49 AM Michael Cizmar <
> > > > michael.ciz...@mcplusa.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > I've got a fix for this.  I switched to using a Maven plugin that
> > > spins
> > > > > up
> > > > > > an Elasticsearch instance.  With this, you need only to remove
> the
> > > Node
> > > > > > code in the integration tests.  Tested with 6.x client and 7.x
> > > > > > elasticsearch.
> > > > > >
> > > > > > There are more things we can do with this output plugin in the
> > future
> > > > > like
> > > > > > moving to the SDK.
> > > > > >
> > > > > > M
> > > > > >
> > > > > > On 4/18/20, 8:32 AM, "Karl Wright"  wrote:
> > > > > >
> > > > > > Thanks for the quick reply.
> > > > > > I agree we don't want to turn off the ES connector itself,
> but
> > > yes
> > > > we
> > > > > > will
> > > > > > need to shut down the tests.  Cihad, would you like to
> propose
> > a
> > > > > > strategy
> > > > > > for that?  I think for now just marking them with @Ignore
> > should
> > > be
> > > > > OK,
> > > > > > since the tests don't have compile time dependencies on
> missing
> > > > > > classes.
> > > > > > What do you think?
> > > > > >
> > > > > >   

[jira] [Commented] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-04-19 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17087254#comment-17087254
 ] 

Michael Cizmar commented on CONNECTORS-1639:


I was away from my desk.  A very vanilla setup script for Elastic would:
 * Download Elastic
 * Unpack it
 * Run bin/elasticsearch

We could modify whatever configuration files we need, but this gives you a 
one-node cluster to test against.  I also just saw the following Ant file, 
which has a 'wait for elastic' step that blocks until the node comes up.

[https://github.com/dadoonet/fscrawler/blob/2829f49074ccc2692fb257e36c0d3be6300b3c41/src/test/ant/integration-tests.xml]

I will work on a script (sh) for you.
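
If the node ends up being launched from the Java test class rather than 
from sh, the "run bin/elasticsearch" step might be sketched as below.  The 
class name, log file name and 30 second grace period are assumptions; 
process.descendants() needs Java 9 or later.

```java
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical launcher for an unpacked Elasticsearch distribution. */
final class EsProcessLauncher {

    /** Relative path of the startup script for the given os.name value. */
    static String scriptFor(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? "bin\\elasticsearch.bat" : "bin/elasticsearch";
    }

    /** Starts the node from esHome and sends its console output to a log file. */
    static Process start(File esHome) throws IOException {
        File script = new File(esHome, scriptFor(System.getProperty("os.name")));
        ProcessBuilder builder = new ProcessBuilder(script.getAbsolutePath());
        builder.directory(esHome);
        builder.redirectErrorStream(true);
        builder.redirectOutput(new File(esHome, "es-test.log")); // assumed log name
        return builder.start();
    }

    /** Stops the node: child processes first, then the launcher process itself. */
    static void stop(Process process) throws InterruptedException {
        process.descendants().forEach(ProcessHandle::destroy);
        process.destroy();
        if (!process.waitFor(30, TimeUnit.SECONDS)) {
            process.destroyForcibly();
        }
    }
}
```

Holding on to the Process object replaces writing out a PID; a standalone 
script would still need the PID file.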

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





[jira] [Commented] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-04-18 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086677#comment-17086677
 ] 

Michael Cizmar commented on CONNECTORS-1639:


Right.  If you upgrade/remove the library causing the issue, I think everything 
in this connector is JDK 11 ready.

 

Given Elastic's direction, I went with starting a cluster outside the test, 
but I think that one will also work.  The plugin is in the Maven Central 
repository here:

[https://mvnrepository.com/artifact/com.github.alexcojocaru/elasticsearch-maven-plugin]

I will incorporate the other lib in a diff.

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





Re: Release schedule

2020-04-18 Thread Michael Cizmar
I've updated the ticket with the changes:

https://issues.apache.org/jira/projects/CONNECTORS/issues/CONNECTORS-1639

On Sat, Apr 18, 2020 at 3:59 PM Cihad Guzel  wrote:

> Thanks folks for your information.
>
> I reviewed the pom.xml of the  ES connector. It uses an old elastic search
> version as dependency. It misled me. On the other hand, you are right. I
> agree with your thoughts on this matter. It is best if we can rearrange
> them.
>
> Kind regards,
> Cihad Guzel
>
>
> On Sat, Apr 18, 2020 at 17:49, Karl Wright wrote:
>
> > I can help with Ant tasks but I need information as to how you're
> supposed
> > to start the ES instance.  An ant task snippet would be sufficient I
> think.
> >
> >
> > On Sat, Apr 18, 2020 at 10:33 AM Michael Cizmar <
> > michael.ciz...@mcplusa.com>
> > wrote:
> >
> > > I believe so.  I only modified the Pom in the es connector project and
> > > removed the Node references.  I know there is a way to do this in ant
> as
> > > well.  I will look it up but may need some guidance on Ant.
> > >
> > > 
> > > From: Karl Wright 
> > > Sent: Saturday, April 18, 2020 9:15:15 AM
> > > To: dev 
> > > Subject: Re: Release schedule
> > >
> > > Hi Michael,
> > > This has to run under Ant as well.  Any way to make that happen?
> > >
> > > Karl
> > >
> > >
> > > On Sat, Apr 18, 2020 at 9:49 AM Michael Cizmar <
> > michael.ciz...@mcplusa.com
> > > >
> > > wrote:
> > >
> > > > I've got a fix for this.  I switched to using a Maven plugin that
> spins
> > > up
> > > > an Elasticsearch instance.  With this, you need only to remove the
> Node
> > > > code in the integration tests.  Tested with 6.x client and 7.x
> > > > elasticsearch.
> > > >
> > > > There are more things we can do with this output plugin in the future
> > > like
> > > > moving to the SDK.
> > > >
> > > > M
> > > >
> > > > On 4/18/20, 8:32 AM, "Karl Wright"  wrote:
> > > >
> > > > Thanks for the quick reply.
> > > > I agree we don't want to turn off the ES connector itself, but
> yes
> > we
> > > > will
> > > > need to shut down the tests.  Cihad, would you like to propose a
> > > > strategy
> > > > for that?  I think for now just marking them with @Ignore should
> be
> > > OK,
> > > > since the tests don't have compile time dependencies on missing
> > > > classes.
> > > > What do you think?
> > > >
> > > > Upgrading to ES 6.x is obviously the right thing to do but who
> here
> > > > has the
> > > > knowledge to do a good job with this?  I am certain there are a
> > > number
> > > > of
> > > > ES users lurking on this list.  Please volunteer if so.
> > > >
> > > > Karl
> > > >
> > > >
> > > > On Sat, Apr 18, 2020 at 9:15 AM Furkan KAMACI <
> > > furkankam...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > There is a compatibility matrix for Elasticsearch. We need to
> > > > support at
> > > > > least Elasticsearch 6.5.x for Java 11 support. You can check it
> > > from
> > > > here:
> > > > > https://www.elastic.co/de/support/matrix#matrix_jvm
> > > > >
> > > > > @Cihad
> > > > >
> > > > > As far as I know, current support is not 2.0.0. It is 5.5.2:
> > > > >
> > https://github.com/apache/manifoldcf-integration-elasticsearch-5.5
> > > > >
> > > > > @Karl Wright 
> > > > >
> > > > > So, such an upgrade from 5.5.2 to 6.5.x may not be so painful.
> > > > Committers
> > > > > who use ES can comment on this.
> > > > >
> > > > > My comments:
> > > > >
> > > > > +1 to temporarily turning those tests off
> > > > > -1 to temporarily turning the connector off
> > > > >
> > > > > Kind Regards,
> > > > > F

[jira] [Updated] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-04-18 Thread Michael Cizmar (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Cizmar updated CONNECTORS-1639:
---
Attachment: CONNECTORS-1639.diff

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: CONNECTORS-1639.diff, 
> elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.





Re: Release schedule

2020-04-18 Thread Michael Cizmar
I believe so.  I only modified the POM in the ES connector project and removed 
the Node references.  I know there is a way to do this in Ant as well.  I will 
look it up but may need some guidance on Ant.

From: Karl Wright 
Sent: Saturday, April 18, 2020 9:15:15 AM
To: dev 
Subject: Re: Release schedule

Hi Michael,
This has to run under Ant as well.  Any way to make that happen?

Karl


On Sat, Apr 18, 2020 at 9:49 AM Michael Cizmar 
wrote:

> I've got a fix for this.  I switched to using a Maven plugin that spins up
> an Elasticsearch instance.  With this, you need only to remove the Node
> code in the integration tests.  Tested with 6.x client and 7.x
> elasticsearch.
>
> There are more things we can do with this output plugin in the future like
> moving to the SDK.
>
> M
>
> On 4/18/20, 8:32 AM, "Karl Wright"  wrote:
>
> Thanks for the quick reply.
> I agree we don't want to turn off the ES connector itself, but yes we
> will
> need to shut down the tests.  Cihad, would you like to propose a
> strategy
> for that?  I think for now just marking them with @Ignore should be OK,
> since the tests don't have compile time dependencies on missing
> classes.
> What do you think?
>
> Upgrading to ES 6.x is obviously the right thing to do but who here
> has the
> knowledge to do a good job with this?  I am certain there are a number
> of
> ES users lurking on this list.  Please volunteer if so.
>
> Karl
>
>
> On Sat, Apr 18, 2020 at 9:15 AM Furkan KAMACI 
> wrote:
>
> > Hi,
> >
> > There is a compatibility matrix for Elasticsearch. We need to
> support at
> > least Elasticsearch 6.5.x for Java 11 support. You can check it from
> here:
> > https://www.elastic.co/de/support/matrix#matrix_jvm
> >
> > @Cihad
> >
> > As far as I know, current support is not 2.0.0. It is 5.5.2:
> > https://github.com/apache/manifoldcf-integration-elasticsearch-5.5
> >
> > @Karl Wright 
> >
> > So, such an upgrade from 5.5.2 to 6.5.x may not be so painful.
> Committers
> > who use ES can comment on this.
> >
> > My comments:
> >
> > +1 to temporarily turning those tests off
> > -1 to temporarily turning the connector off
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Sat, Apr 18, 2020 at 3:27 PM Cihad Guzel 
> wrote:
> >
> >> Hi Karl,
> >>
> >> MFC ES Connector uses the Elastic Search 2.0.0 . It's an ancient
> version.
> >> The latest version is 7.6.2 . So, I agree with you and I think we
> can
> >> temporarily turn the connector off.
> >>
> >> +1
> >>
> >> Kind Regards,
> >> Cihad Güzel
> >>
> >>
> >> On Sat, Apr 18, 2020 at 11:41, Karl Wright wrote:
> >>
> >> > Hi all,
> >> >
> >> > We're due to release ManifoldCF 2.16 by April 30th.  The major
> work for
> >> > this release was adoption of Java 11, and that work is incomplete
> >> because
> >> > of ElasticSearch incompatibilities.  I'm therefore tempted to
> hold the
> >> > release until we at least have a plan for dealing with ES going
> forward.
> >> >
> >> > It's not clear that our ES connector support is affected, but
> certainly
> >> our
> >> > integration tests are, because Java 11 isn't supported in any of
> the ES
> >> > versions we run for those tests.  So at the least we need to
> decide to
> >> turn
> >> > those off.  And indeed, we really need to have someone with ES
> >> experience
> >> > map a strategy for getting our ES support back into compliance
> with
> >> what's
> >> > out in the world at large now.  Cihad Guzel did much work on Java
> 11 but
> >> > stumbled over the Elastic Search problem.  Any of our committers
> who
> >> know
> >> > ES and are stuck inside at the moment, please speak up.
> >> >
> >> > Thanks in advance,
> >> > Karl
> >> >
> >>
> >
>
>


Re: Release schedule

2020-04-18 Thread Michael Cizmar
There is a check in the 2.0 code that fails:

Caused by: java.lang.UnsupportedOperationException: Boot class path mechanism is not supported
    at java.management/sun.management.RuntimeImpl.getBootClassPath(RuntimeImpl.java:99)
    at org.elasticsearch.monitor.jvm.JvmInfo.<clinit>(JvmInfo.java:77)
    at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:132)
    at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
    at org.elasticsearch.node.NodeBuilder.node(NodeBuilder.java:166)

It is hit because the integration test creates an embedded Elastic node to 
submit documents to.  Embedding ES is not supported by Elastic, so they have 
made this difficult to do.  We really do not need to do this anyway: you can 
spin up an ES instance outside the test via a Maven plugin.  The unit test 
code calls ES directly, so with that in place you can upgrade the libraries 
and it works on the later versions and on Java 11.
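
For context on the trace: RuntimeMXBean.getBootClassPath() throws 
UnsupportedOperationException on JVMs that have no boot class path 
mechanism, which is the case from Java 9 onward.  A guarded call looks 
roughly like the sketch below; the class name is hypothetical and this is 
not the code from any Elasticsearch release.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

/** Hypothetical illustration of a guarded boot class path lookup. */
final class BootClassPathCheck {

    /** Returns the boot class path, or an empty string where the JVM has none. */
    static String bootClassPathOrEmpty() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        // Asking first avoids the UnsupportedOperationException seen in the trace.
        return runtime.isBootClassPathSupported() ? runtime.getBootClassPath() : "";
    }

    public static void main(String[] args) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        System.out.println("boot class path supported: " + runtime.isBootClassPathSupported());
        System.out.println("boot class path: " + bootClassPathOrEmpty());
    }
}
```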

I’ll clean up what I worked on and submit it for consideration.

Michael


From: Furkan KAMACI 
Date: Saturday, April 18, 2020 at 9:04 AM
To: Michael Cizmar 
Cc: "dev@manifoldcf.apache.org" 
Subject: Re: Release schedule

Hi,

By the way, when I check the error here: 
https://issues.apache.org/jira/secure/attachment/12997363/elastic-search-1.0.1-java11-build-error.log

Seems that it is not specific to Java 11: 
https://stackoverflow.com/questions/46636954/elasticsearch-1-1-2-not-starting-properly

Upgrading ES to minimum Java 9 support version which is 6.2.x can be fine: 
https://www.elastic.co/de/support/matrix#matrix_jvm

Kind Regards,
Furkan KAMACI

On Sat, Apr 18, 2020 at 4:49 PM Michael Cizmar <michael.ciz...@mcplusa.com> wrote:
I've got a fix for this.  I switched to using a Maven plugin that spins up an 
Elasticsearch instance.  With this, you need only to remove the Node code in 
the integration tests.  Tested with 6.x client and 7.x elasticsearch.

There are more things we can do with this output plugin in the future like 
moving to the SDK.

M

On 4/18/20, 8:32 AM, "Karl Wright" <daddy...@gmail.com> wrote:

Thanks for the quick reply.
I agree we don't want to turn off the ES connector itself, but yes we will
need to shut down the tests.  Cihad, would you like to propose a strategy
for that?  I think for now just marking them with @Ignore should be OK,
since the tests don't have compile time dependencies on missing classes.
What do you think?

Upgrading to ES 6.x is obviously the right thing to do but who here has the
knowledge to do a good job with this?  I am certain there are a number of
ES users lurking on this list.  Please volunteer if so.

Karl


On Sat, Apr 18, 2020 at 9:15 AM Furkan KAMACI <furkankam...@gmail.com> wrote:

> Hi,
>
> There is a compatibility matrix for Elasticsearch. We need to support at
> least Elasticsearch 6.5.x for Java 11 support. You can check it from here:
> https://www.elastic.co/de/support/matrix#matrix_jvm
>
> @Cihad
>
> As far as I know, current support is not 2.0.0. It is 5.5.2:
> https://github.com/apache/manifoldcf-integration-elasticsearch-5.5
>
> @Karl Wright <daddy...@gmail.com>
>
> So, such an upgrade from 5.5.2 to 6.5.x may not be so painful. Committers
> who use ES can comment on this.
>
> My comments:
>
> +1 to temporarily turning those tests off
> -1 to temporarily turning the connector off
>
> Kind Regards,
> Furkan KAMACI
>
> On Sat, Apr 18, 2020 at 3:27 PM Cihad Guzel <cguz...@gmail.com> wrote:
>
>> Hi Karl,
>>
>> MFC ES Connector uses the Elastic Search 2.0.0 . It's an ancient version.
>> The latest version is 7.6.2 . So, I agree with you and I think we can
>> temporarily turn the connector off.
>>
>> +1
>>
>> Kind Regards,
>> Cihad Güzel
>>
>>
>> On Sat, 18 Apr 2020 at 11:41, Karl Wright wrote:
>>
>> > Hi all,
>> >
>> > We're due to release ManifoldCF 2.16 by April 30th.  The major work for
>> > this release was adoption of Java 11, and that work is incomplete because
>> > of ElasticSearch incompatibilities.  I'm therefore tempted to hold the
>> > release until we at least have a plan for dealing with ES going forward.
>> >
>> > It's not clear that our ES connector support is affected, but certainly our
>> > integration tests are, because Java 11 isn't supported in any of the ES
>> > versions we run for those tests.  So at the least we need to decide to turn

Re: Release schedule

2020-04-18 Thread Michael Cizmar
I've got a fix for this.  I switched to using a Maven plugin that spins up an 
Elasticsearch instance.  With this, you need only to remove the Node code in 
the integration tests.  Tested with 6.x client and 7.x elasticsearch.
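
A minimal sketch of what an integration test might do once the Elasticsearch instance is started outside the test JVM (for example by a Maven plugin): wait until the HTTP endpoint answers instead of constructing a Node inside the test. The class name, the method names, and the stub server are assumptions made for illustration; they are not taken from ManifoldCF or from the plugin mentioned above.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: none of these names come from ManifoldCF or Elasticsearch.
final class ExternalEsProbe {

  /** Returns true once GET baseUrl answers with HTTP 200, trying up to maxAttempts times. */
  static boolean waitUntilUp(String baseUrl, int maxAttempts, long pauseMillis) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl).openConnection();
        conn.setConnectTimeout(1000);
        conn.setReadTimeout(1000);
        conn.setRequestMethod("GET");
        int status = conn.getResponseCode();
        conn.disconnect();
        if (status == 200) {
          return true;
        }
      } catch (IOException e) {
        // Instance not reachable yet; fall through and retry.
      }
      try {
        Thread.sleep(pauseMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  /** Starts a throwaway local HTTP server that stands in for the externally started instance. */
  static HttpServer startStub() {
    try {
      HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
      server.createContext("/", exchange -> {
        byte[] body = "{}".getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, body.length);
        exchange.getResponseBody().write(body);
        exchange.close();
      });
      server.start();
      return server;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In a real build the URL would point at whatever host and port the externally started instance listens on; the stub exists only so the sketch can be exercised without Elasticsearch.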

There are more things we can do with this output plugin in the future like 
moving to the SDK.

M

On 4/18/20, 8:32 AM, "Karl Wright"  wrote:

Thanks for the quick reply.
I agree we don't want to turn off the ES connector itself, but yes we will
need to shut down the tests.  Cihad, would you like to propose a strategy
for that?  I think for now just marking them with @Ignore should be OK,
since the tests don't have compile time dependencies on missing classes.
What do you think?

Upgrading to ES 6.x is obviously the right thing to do but who here has the
knowledge to do a good job with this?  I am certain there are a number of
ES users lurking on this list.  Please volunteer if so.

Karl


On Sat, Apr 18, 2020 at 9:15 AM Furkan KAMACI 
wrote:

> Hi,
>
> There is a compatibility matrix for Elasticsearch. We need to support at
> least Elasticsearch 6.5.x for Java 11 support. You can check it from here:
> https://www.elastic.co/de/support/matrix#matrix_jvm
>
> @Cihad
>
> As far as I know, current support is not 2.0.0. It is 5.5.2:
> https://github.com/apache/manifoldcf-integration-elasticsearch-5.5
>
> @Karl Wright 
>
> So, such an upgrade from 5.5.2 to 6.5.x may not be so painful. Committers
> who use ES can comment on this.
>
> My comments:
>
> +1 to temporarily turning those tests off
> -1 to temporarily turning the connector off
>
> Kind Regards,
> Furkan KAMACI
>
> On Sat, Apr 18, 2020 at 3:27 PM Cihad Guzel  wrote:
>
>> Hi Karl,
>>
>> MFC ES Connector uses the Elastic Search 2.0.0 . It's an ancient version.
>> The latest version is 7.6.2 . So, I agree with you and I think we can
>> temporarily turn the connector off.
>>
>> +1
>>
>> Kind Regards,
>> Cihad Güzel
>>
>>
>> On Sat, 18 Apr 2020 at 11:41, Karl Wright wrote:
>>
>> > Hi all,
>> >
>> > We're due to release ManifoldCF 2.16 by April 30th.  The major work for
>> > this release was adoption of Java 11, and that work is incomplete because
>> > of ElasticSearch incompatibilities.  I'm therefore tempted to hold the
>> > release until we at least have a plan for dealing with ES going forward.
>> >
>> > It's not clear that our ES connector support is affected, but certainly our
>> > integration tests are, because Java 11 isn't supported in any of the ES
>> > versions we run for those tests.  So at the least we need to decide to turn
>> > those off.  And indeed, we really need to have someone with ES experience
>> > map a strategy for getting our ES support back into compliance with what's
>> > out in the world at large now.  Cihad Guzel did much work on Java 11 but
>> > stumbled over the Elastic Search problem.  Any of our committers who know
>> > ES and are stuck inside at the moment, please speak up.
>> >
>> > Thanks in advance,
>> > Karl
>> >
>>
>



Re: Release schedule

2020-04-18 Thread Michael Cizmar
Right.  The issue is that the unit tests rely on embedding an Elasticsearch 
instance, which is officially deprecated:
https://www.elastic.co/blog/elasticsearch-the-server

There are integration test methods built on ESIntegTestCase.  I'm checking 
whether we can quickly refactor to use that; otherwise the unit tests are 
going to need to be rewritten.

The connector code is more or less fine from what I can see.



On 4/18/20, 8:22 AM, "Furkan KAMACI"  wrote:

Hi,

There is a compatibility matrix for Elasticsearch. We need to support at
least Elasticsearch 6.5.x for Java 11 support. You can check it from here:
https://www.elastic.co/de/support/matrix#matrix_jvm

@Cihad

As far as I know, current support is not 2.0.0. It is 5.5.2:
https://github.com/apache/manifoldcf-integration-elasticsearch-5.5

@Karl Wright 

So, such an upgrade from 5.5.2 to 6.5.x may not be so painful. Committers
who use ES can comment on this.

My comments:

+1 to temporarily turning those tests off
-1 to temporarily turning the connector off

Kind Regards,
Furkan KAMACI

On Sat, Apr 18, 2020 at 3:27 PM Cihad Guzel  wrote:

> Hi Karl,
>
> MFC ES Connector uses the Elastic Search 2.0.0 . It's an ancient version.
> The latest version is 7.6.2 . So, I agree with you and I think we can
> temporarily turn the connector off.
>
> +1
>
> Kind Regards,
> Cihad Güzel
>
>
> On Sat, 18 Apr 2020 at 11:41, Karl Wright wrote:
>
> > Hi all,
> >
> > We're due to release ManifoldCF 2.16 by April 30th.  The major work for
> > this release was adoption of Java 11, and that work is incomplete because
> > of ElasticSearch incompatibilities.  I'm therefore tempted to hold the
> > release until we at least have a plan for dealing with ES going forward.
> >
> > It's not clear that our ES connector support is affected, but certainly our
> > integration tests are, because Java 11 isn't supported in any of the ES
> > versions we run for those tests.  So at the least we need to decide to turn
> > those off.  And indeed, we really need to have someone with ES experience
> > map a strategy for getting our ES support back into compliance with what's
> > out in the world at large now.  Cihad Guzel did much work on Java 11 but
> > stumbled over the Elastic Search problem.  Any of our committers who know
> > ES and are stuck inside at the moment, please speak up.
> >
> > Thanks in advance,
> > Karl
> >
>



[jira] [Comment Edited] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-03-22 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064416#comment-17064416
 ] 

Michael Cizmar edited comment on CONNECTORS-1639 at 3/22/20, 8:53 PM:
--

The output connector uses Elasticsearch 1.0.1 for an integration test.  The 
actual connector uses more or less raw REST calls to interact with 
Elasticsearch, so there is no dependency between the test case and the version 
of Elastic that is supported.  The error being thrown comes from the 
integration test trying to start an Elastic node.  We should perhaps upgrade 
the test to a more recent version.

[~kwri...@metacarta.com] manifoldcf does not support Java 11 yet, correct?


was (Author: michaelcizmar):
The output connector uses is 1.0.1 for an integration test.   In the actual 
connector, it uses more or less raw rest calls to interact with Elasticsearch 
and there is no depency between the test case and the version of Elastic that 
is supported as the error that is being thrown is related to the integration 
test trying to start an Elastic node.  We should upgrade this to a more recent 
version perhaps.

[~kwri...@metacarta.com] manifoldcf does not support Java 11 yet, correct?

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CONNECTORS-1639) Upgrade Elastic Search Version

2020-03-22 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064416#comment-17064416
 ] 

Michael Cizmar commented on CONNECTORS-1639:


The output connector uses Elasticsearch 1.0.1 for an integration test.  The 
actual connector uses more or less raw REST calls to interact with 
Elasticsearch, so there is no dependency between the test case and the version 
of Elastic that is supported.  The error being thrown comes from the 
integration test trying to start an Elastic node.  We should perhaps upgrade 
the test to a more recent version.

[~kwri...@metacarta.com] manifoldcf does not support Java 11 yet, correct?

> Upgrade Elastic Search Version
> --
>
> Key: CONNECTORS-1639
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1639
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Reporter: Cihad Guzel
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: elastic-search-1.0.1-java11-build-error.log
>
>
> Current Elastic Search version is 1.0.1 . According to [this 
> matrix|https://www.elastic.co/support/matrix#matrix_jvm], Java 11 is not 
> supported by any ES version below 6.5.
> Besides, ES 1.x is no longer supported.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: ElasticSearch-literate volunteers needed for JDK 11 port

2020-03-22 Thread Michael Cizmar
We can take a look.

From: Karl Wright 
Sent: Sunday, March 22, 2020 10:05:40 AM
To: dev 
Subject: ElasticSearch-literate volunteers needed for JDK 11 port

Hi All,

The version of ElasticSearch we support apparently is incompatible with JDK
11.  We therefore will need to update that connector and the associated
tests as well.  See:

https://issues.apache.org/jira/browse/CONNECTORS-1624?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel=17059835


I would ask anyone with any recent experience with ElasticSearch to please
step forward and lend a hand with this exercise.  Any volunteers?

Thanks in advance,
Karl


java.net.SocketException: Socket is closed

2020-03-16 Thread Michael Cizmar
Wondering if anyone else has seen this when configuring a forms
authentication rule via Access Credentials.  When the POST to the form is
made, it times out and returns the exception below.  The form looks
correct.

Thanks,

M
-
DEBUG 2020-03-16T21:05:48,423 (Thread-7314) - Connection released: [id: 1][route: {s}->https://mywebsite.company.com:443][total kept alive: 0; route allocated: 0 of 1; total allocated: 0 of 20]
DEBUG 2020-03-16T21:05:48,423 (Thread-7314) - Cancelling request execution
DEBUG 2020-03-16T21:05:48,423 (Worker thread '0') - Web: IO exception (java.net.SocketException)reading header for 'https://mywebsite.company.com/pkmslogin.form', retrying
java.net.SocketException: Socket is closed
    at java.net.Socket.setSoTimeout(Socket.java:1155) ~[?:1.8.0_232]
    at sun.security.ssl.BaseSSLSocketImpl.setSoTimeout(BaseSSLSocketImpl.java:633) ~[?:1.8.0_232]
    at sun.security.ssl.SSLSocketImpl.setSoTimeout(SSLSocketImpl.java:2556) ~[?:1.8.0_232]
    at org.apache.http.impl.BHttpConnectionBase.fillInputBuffer(BHttpConnectionBase.java:346) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.BHttpConnectionBase.awaitInput(BHttpConnectionBase.java:354) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.DefaultBHttpClientConnection.isResponseAvailable(DefaultBHttpClientConnection.java:130) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.conn.CPoolProxy.isResponseAvailable(CPoolProxy.java:142) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:219) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ExecuteMethodThread.run(ThrottledFetcher.java:1266) ~[?:?]
 INFO 2020-03-16T21:05:48,425 (Worker thread '0') - WEB: FETCH LOGIN|https://mywebsite.company.com/pkmslogin.form|1584410688357+60067|-103|0|java.net.SocketException|Socket is closed
DEBUG 2020-03-16T21:05:48,425 (Worker thread '0') - WEB: Fetch exception for 'https://mywebsite.company.com/pkmslogin.form'
java.net.SocketException: Socket is closed
    at java.net.Socket.setSoTimeout(Socket.java:1155) ~[?:1.8.0_232]
    at sun.security.ssl.BaseSSLSocketImpl.setSoTimeout(BaseSSLSocketImpl.java:633) ~[?:1.8.0_232]
    at sun.security.ssl.SSLSocketImpl.setSoTimeout(SSLSocketImpl.java:2556) ~[?:1.8.0_232]
    at org.apache.http.impl.BHttpConnectionBase.fillInputBuffer(BHttpConnectionBase.java:346) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.BHttpConnectionBase.awaitInput(BHttpConnectionBase.java:354) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.DefaultBHttpClientConnection.isResponseAvailable(DefaultBHttpClientConnection.java:130) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.conn.CPoolProxy.isResponseAvailable(CPoolProxy.java:142) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:219) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.8.jar:4.5.8]
    at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ExecuteMethodThread.run(ThrottledFetcher.java:1266) ~[?:?]
 WARN 2020-03-16T21:05:48,426 (Worker thread '0') - Service interruption reported for job 

[jira] [Commented] (CONNECTORS-1633) Exception tossed: Repeated service interruptions - failure processing document: The process cannot access the file because it is being used by another process.

2020-01-24 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023000#comment-17023000
 ] 

Michael Cizmar commented on CONNECTORS-1633:


I don't think the whole job should be aborted because of a problem with one 
file on a file share.  In investigating this more on my side, it looks like 
this happens when the job encounters a specific file that's 63 MB.  If the 
size is the cause, then it's strange to get a file lock error.  

I think the process should trap the error, log the event in a readable way, 
and potentially keep a record of the document so that retrieval can be 
attempted later.  Again, in my case the file share is a group folder, so we 
can't control what goes in it, and I agree with you about the notion of 
skipping. 
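
The behaviour proposed above (trap the error, log it readably, keep a record for a later attempt) can be sketched roughly as follows. This is only an illustration under stated assumptions: `DocumentFetcher`, `Result`, and `crawl` are hypothetical names and do not correspond to the ManifoldCF connector API or to the SharedDriveConnector code in the stack trace.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical types for illustration; this is not the ManifoldCF connector API.
final class SkipAndRecordCrawl {
  private static final Logger LOG = Logger.getLogger(SkipAndRecordCrawl.class.getName());

  /** Stand-in for whatever reads one document from the share. */
  interface DocumentFetcher {
    byte[] fetch(String path) throws IOException;
  }

  /** Outcome of one pass: what was read and what should be retried later. */
  static final class Result {
    final List<String> processed = new ArrayList<>();
    final List<String> deferred = new ArrayList<>();
  }

  static Result crawl(List<String> paths, DocumentFetcher fetcher) {
    Result result = new Result();
    for (String path : paths) {
      try {
        fetcher.fetch(path);
        result.processed.add(path);
      } catch (IOException e) {
        // Trap the failure, log it readably, and remember the document for a later attempt.
        LOG.log(Level.WARNING, "Skipping {0} for now: {1}", new Object[] {path, e.getMessage()});
        result.deferred.add(path);
      }
    }
    return result;
  }
}
```

With this shape, one locked file ends up in `deferred` while the rest of the pass continues, instead of the first failure aborting the job.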

> Exception tossed: Repeated service interruptions - failure processing 
> document: The process cannot access the file because it is being used by 
> another process.
> ---
>
> Key: CONNECTORS-1633
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1633
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: File system connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Assignee: Karl Wright
>Priority: Major
>
> Seeing this error occurring and I'm working to address it.  If it's not a 
> bug, a better message should be generated.
>  
> {code:java}
> crawl job fails with the following error due to document being in use by 
> another user: 
>  WARN 2019-08-25T15:02:27,416 (Worker thread '11') - Service interruption 
> reported for job 1565115290083 connection 'fs_vwoaahvp319': Timeout or other 
> service interruption: The process cannot access the file because it is being 
> used by another process.
> ERROR 2019-08-25T15:02:27,424 (Worker thread '11') - Exception tossed: 
> Repeated service interruptions - failure processing document: The process 
> cannot access the file because it is being used by another process.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service 
> interruptions - failure processing document: The process cannot access the 
> file because it is being used by another process.
>         at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:489) 
> [mcf-pull-agent.jar:?]
> Caused by: jcifs.smb.SmbException: The process cannot access the file because 
> it is being used by another process.
>         at 
> jcifs.smb.SmbTransportImpl.checkStatus2(SmbTransportImpl.java:1457) ~[?:?]
>         at jcifs.smb.SmbTransportImpl.checkStatus(SmbTransportImpl.java:1568) 
> ~[?:?]
>         at jcifs.smb.SmbTransportImpl.sendrecv(SmbTransportImpl.java:1023) 
> ~[?:?]
>         at jcifs.smb.SmbTransportImpl.send(SmbTransportImpl.java:1539) ~[?:?]
>         at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:409) ~[?:?]
>         at jcifs.smb.SmbTreeImpl.send(SmbTreeImpl.java:472) ~[?:?]
>         at jcifs.smb.SmbTreeConnection.send0(SmbTreeConnection.java:401) 
> ~[?:?]
>         at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:315) ~[?:?]
>         at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:295) ~[?:?]
>         at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:130) ~[?:?]
>         at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:117) ~[?:?]
>         at jcifs.smb.SmbFile.withOpen(SmbFile.java:1741) ~[?:?]
>         at jcifs.smb.SmbFile.withOpen(SmbFile.java:1710) ~[?:?]
>         at jcifs.smb.SmbFile.withOpen(SmbFile.java:1704) ~[?:?]
>         at jcifs.smb.SmbFile.queryPath(SmbFile.java:770) ~[?:?]
>         at jcifs.smb.SmbFile.exists(SmbFile.java:851) ~[?:?]
>         at 
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.fileExists(SharedDriveConnector.java:2188)
>  ~[?:?]
>         at 
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:610)
>  ~[?:?]
>         at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) 
> ~[mcf-pull-agent.jar:?]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CONNECTORS-1633) Exception tossed: Repeated service interruptions - failure processing document: The process cannot access the file because it is being used by another process.

2020-01-24 Thread Michael Cizmar (Jira)
Michael Cizmar created CONNECTORS-1633:
--

 Summary: Exception tossed: Repeated service interruptions - 
failure processing document: The process cannot access the file because it is 
being used by another process.
 Key: CONNECTORS-1633
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1633
 Project: ManifoldCF
  Issue Type: Bug
  Components: File system connector
Affects Versions: ManifoldCF 2.13
Reporter: Michael Cizmar


Seeing this error occurring and I'm working to address it.  If it's not a bug, 
a better message should be generated.

 
{code:java}
crawl job fails with the following error due to document being in use by 
another user: 
 WARN 2019-08-25T15:02:27,416 (Worker thread '11') - Service interruption 
reported for job 1565115290083 connection 'fs_vwoaahvp319': Timeout or other 
service interruption: The process cannot access the file because it is being 
used by another process.
ERROR 2019-08-25T15:02:27,424 (Worker thread '11') - Exception tossed: Repeated 
service interruptions - failure processing document: The process cannot access 
the file because it is being used by another process.
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service 
interruptions - failure processing document: The process cannot access the file 
because it is being used by another process.
        at 
org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:489) 
[mcf-pull-agent.jar:?]
Caused by: jcifs.smb.SmbException: The process cannot access the file because 
it is being used by another process.
        at jcifs.smb.SmbTransportImpl.checkStatus2(SmbTransportImpl.java:1457) 
~[?:?]
        at jcifs.smb.SmbTransportImpl.checkStatus(SmbTransportImpl.java:1568) 
~[?:?]
        at jcifs.smb.SmbTransportImpl.sendrecv(SmbTransportImpl.java:1023) 
~[?:?]
        at jcifs.smb.SmbTransportImpl.send(SmbTransportImpl.java:1539) ~[?:?]
        at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:409) ~[?:?]
        at jcifs.smb.SmbTreeImpl.send(SmbTreeImpl.java:472) ~[?:?]
        at jcifs.smb.SmbTreeConnection.send0(SmbTreeConnection.java:401) ~[?:?]
        at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:315) ~[?:?]
        at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:295) ~[?:?]
        at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:130) ~[?:?]
        at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:117) ~[?:?]
        at jcifs.smb.SmbFile.withOpen(SmbFile.java:1741) ~[?:?]
        at jcifs.smb.SmbFile.withOpen(SmbFile.java:1710) ~[?:?]
        at jcifs.smb.SmbFile.withOpen(SmbFile.java:1704) ~[?:?]
        at jcifs.smb.SmbFile.queryPath(SmbFile.java:770) ~[?:?]
        at jcifs.smb.SmbFile.exists(SmbFile.java:851) ~[?:?]
        at 
org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.fileExists(SharedDriveConnector.java:2188)
 ~[?:?]
        at 
org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:610)
 ~[?:?]
        at 
org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) 
~[mcf-pull-agent.jar:?]
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CONNECTORS-1608) Connecting ADFS user using SharePoint Connector

2020-01-24 Thread Michael Cizmar (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022934#comment-17022934
 ] 

Michael Cizmar commented on CONNECTORS-1608:


ADFS makes some additional redirects.  You should be able to make a user in 
Sharepoint which will resolve this issue.

> Connecting ADFS user using SharePoint Connector
> ---
>
> Key: CONNECTORS-1608
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1608
> Project: ManifoldCF
>  Issue Type: Wish
>  Components: SharePoint 2013 MCPermissions extension, SharePoint 
> connector
> Environment: SharePoint 2013
>Reporter: NEHAL BHANDARI
>Priority: Critical
>
> I have ADFS enable on sharepoint webapplication. I am trying to create a 
> repository connection that authenticates the ADFS User to SharePoint. It 
> gives me an error 403. It works well when I enable NTLM only but once ADFS is 
> enabled it gives me 403. Is there anything I am missing or is it something 
> which is not possible.
> My Adfs user looks like i:05.t|adfs|, But I don't know how to make this user 
> work to authenticate it on SharePoint WebApplication.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CONNECTORS-1619) When you edit the Elastic Output connector, you need to set the password again

2019-08-16 Thread Michael Cizmar (JIRA)
Michael Cizmar created CONNECTORS-1619:
--

 Summary: When you edit the Elastic Output connector, you need to 
set the password again
 Key: CONNECTORS-1619
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1619
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 2.14
Reporter: Michael Cizmar


When a user reconfigures an output connector, the password seems to be lost.

Steps to reproduce:

1) Click edit on an existing elastic output connector

2) Update something, like index settings

3) Click save

 

Result: the connector status is shown as invalid.

Workaround: edit the connector again and re-enter the password before saving.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: ManifoldCF Resource Offering

2019-08-16 Thread Michael Cizmar
Yes, absolutely.  We'll program to the core interfaces, get them
documented, and unit tested.  Typically our first step is to mock the
service that we are programming against and then approach the development in
a TDD fashion.  In our world, we don't have extensive debugging
capabilities, so we have to rely on great logs that, for example, the client
parses and sends to us.
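
A rough sketch of the mock-first approach described above, assuming a hypothetical `RepositoryService` interface: the fake stands in for the remote system, and the code under test emits one log line per document so that behaviour can be checked from logs alone. None of these names come from ManifoldCF; a real connector would program to its core interfaces instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Every name here is hypothetical and exists only for this sketch.
final class MockedServiceSketch {

  /** The remote system the connector talks to, reduced to what the test needs. */
  interface RepositoryService {
    List<String> listDocumentIds();
    String fetchContent(String id);
  }

  /** In-memory fake used in place of the real service during tests. */
  static final class FakeRepositoryService implements RepositoryService {
    private final Map<String, String> docs = new LinkedHashMap<>();

    void put(String id, String content) {
      docs.put(id, content);
    }

    @Override
    public List<String> listDocumentIds() {
      return new ArrayList<>(docs.keySet());
    }

    @Override
    public String fetchContent(String id) {
      return docs.get(id);
    }
  }

  /** Code under test: walks the service and records one log line per document. */
  static List<String> crawl(RepositoryService service) {
    List<String> log = new ArrayList<>();
    for (String id : service.listDocumentIds()) {
      String content = service.fetchContent(id);
      log.add("fetched id=" + id + " chars=" + content.length());
    }
    return log;
  }
}
```

Writing the expected log lines first and then making `crawl` produce them is one way to follow the TDD flow without access to the real service.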

Ideally with the libraries also going into Maven, that might help us
decouple some of these things as well.

I am a huge fan of incremental progress versus Big Bang.

On Fri, Aug 16, 2019 at 8:42 AM Karl Wright  wrote:

> If you could structure this as one contribution at a time I would be
> grateful.  That way, problems with one contribution won't block others.
>
> We also like to have an integration test for each connector.  Not all of
> them have this at this time, but it's very very helpful if you have the
> time to put one in place.  There are good examples for the Alfresco WS
> connector and the CMIS connector and the File System connector; you can use
> these as templates perhaps.
>
> If the connectors haven't yet been ported, it's a good idea to read at
> least sections of The Book.  You can see them for free here:
>
> https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs
>
> Karl
>
>
> On Fri, Aug 16, 2019 at 8:33 AM Michael Cizmar 
> wrote:
>
> > For better or worse, my company (MC+A <https://www.mcplusa.com/>) was
> > known for a period of time as a 'connector' company although we are a
> > consultancy.  What I'm thinking initially:
> >
> >1. Connector development for repositories like (effectively looking at
> >what we have and migrating them to ManifoldCF over time):
> >   2. lithium
> >   3. jive
> >   4. confluence
> >   5. Sharepoint
> >   6. Salesforce
> >   7. ServiceNow
> >   8. OneDrive
> >2. Output connector to
> >   1. Elastic App Search
> >   2. Elastic Enterprise Search
> >3. Improving the end user experience
> >   1. Updating the documentation
> >   2. Providing some samples
> >4. Bugs as we find them
> >
> >
> >
> > On Fri, Aug 16, 2019 at 2:55 AM Furkan KAMACI 
> > wrote:
> >
> > > Hi Michael,
> > >
> > > Thanks for your interest!
> > >
> > > Community is important for FOSS projects. It can both help to develop such
> > > improvements and use it as may need them.
> > >
> > > So, in my opinion you can share which parts do you want to improve/add and
> > > share it with us for both some can help you and need them.
> > >
> > > Kind Regards,
> > > Furkan KAMACI
> > >
> > > On Fri, 16 Aug 2019 at 07:13, Karl Wright wrote:
> > >
> > > > Excellent!
> > > > Thanks for contacting us.  I hope we can help.
> > > >
> > > > Karl
> > > >
> > > > On Thu, Aug 15, 2019 at 9:17 PM Michael Cizmar <mich...@michaelcizmar.com> wrote:
> > > >
> > > > > Hey Folks,
> > > > >
> > > > > I exchanged emails with Karl earlier this week, but I wanted to reach out
> > > > > to you to offer up some additional assistance for the ManifoldCF project.
> > > > > My firm has previously written about a dozen commercial connectors
> > > > > primarily targeted to the Google Search Appliance.  We are starting to use
> > > > > the ManifoldCF platform as a web crawler for Elasticsearch but have plans
> > > > > to expand it to work with some of the other platforms that we work with.
> > > > >
> > > > >
> > > > >
> > > > > I am planning on contributing to the code base and potentially offering
> > > > > commercial support to our clients.  Would like to get your thoughts on that
> > > > > and where you think our assistance could be best utilized.
> > > > >
> > > > >
> > > > > Thanks!
> > > > >
> > > > >
> > > > > Michael
> > > > >
> > > >
> > >
> >
>


Re: Unexpected HTTP result code: -1: null

2019-08-16 Thread Michael Cizmar
Priya  - Was this right?

48GB and 1-Core Intel(R) Xeon(R) CPU

While not directly related to an out of memory issue, you should have more
cores allocated to ES.
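
As rough arithmetic on the figures reported in the quoted thread below (-Xmx1024m and 15 worker threads), an even split of the heap leaves each worker thread well under 100 MB, before counting anything else the agents process holds. The sketch assumes an even split purely for illustration; real usage per thread is not uniform, and the class and method names are hypothetical.

```java
// Back-of-the-envelope arithmetic only; names are hypothetical.
final class HeapBudget {

  /** Even share of the heap per worker thread, in MB (integer division). */
  static long heapPerWorkerMb(long maxHeapMb, int workerThreads) {
    if (workerThreads <= 0) {
      throw new IllegalArgumentException("workerThreads must be positive");
    }
    return maxHeapMb / workerThreads;
  }

  public static void main(String[] args) {
    // Figures quoted in the thread: -Xmx1024m and 15 worker threads.
    System.out.println(heapPerWorkerMb(1024, 15)); // prints 68
    // Heap limit of the JVM running this sketch, in MB, for comparison.
    System.out.println(Runtime.getRuntime().maxMemory() / (1024 * 1024));
  }
}
```

Under the same assumption, dropping to 5 worker threads would leave about 204 MB each, which is the direction of the advice in the thread: fewer worker threads or fewer Tika Extractor connections, or a larger heap.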

On Fri, Aug 16, 2019 at 1:09 AM Priya Arora  wrote:

> *The existing thread/connection configuration is:*
>
> How many worker threads do you have? 15 worker threads have been
> allocated (in the properties.xml file),
> and 10 Tika Extractor connections are defined.
>
> Is it suggested to reduce these numbers further?
> If not, what else could be a solution?
>
> Thanks
> Priya
>
>
>
> On Wed, Aug 14, 2019 at 5:32 PM Karl Wright  wrote:
>
> > How many worker threads do you have?
> > Even if each worker thread is constrained in memory, and they should be,
> > you can easily cause things to run out of memory by giving too many worker
> > threads.  Another way to keep Tika's usage constrained would be to reduce
> > the number of Tika Extractor connections, because that effectively limits
> > the number of extractions that can be going on at the same time.
> >
> > Karl
> >
> >
> > On Wed, Aug 14, 2019 at 7:23 AM Priya Arora  wrote:
> >
> > > Yes, I am using the Tika Extractor, and the ManifoldCF version in use is 2.13.
> > > I am also using Postgres as the database.
> > >
> > > I have 4 jobs.
> > > One is crawling and re-crawling data from a public site. The other three are
> > > accessing an intranet site.
> > > Two of those three give me correct output without any error; the third,
> > > which has more data than the other two, gives me this error.
> > >
> > > Could this be a site accessibility issue? Can you please
> > > suggest a solution?
> > > Thanks and regards
> > > Priya
> > >
> > > On Wed, Aug 14, 2019 at 3:11 PM Karl Wright wrote:
> > >
> > > > I will need to know more.  Do you have the tika extractor in your
> > > > pipeline?  If so, what version of ManifoldCF are you using?  Tika has had
> > > > bugs related to memory consumption in the past; the out of memory exception
> > > > may be coming from it and therefore a stack trace is critical to have.
> > > >
> > > > Alternatively, you can upgrade to the latest version of MCF (2.13) and that
> > > > has a newer version of Tika without those problems.  But you may need to give
> > > > the agents process more memory.
> > > >
> > > > Another possible cause is that you're using hsqldb in production.  HSQLDB
> > > > keeps all of its tables in memory.  If you have a large crawl, you do not
> > > > want to use HSQLDB.
> > > >
> > > > Thanks,
> > > > Karl
> > > >
> > > >
> > > > On Wed, Aug 14, 2019 at 3:41 AM Priya Arora wrote:
> > > >
> > > > > Hi Karl,
> > > > >
> > > > > The ManifoldCF logs show an error like:
> > > > > agents process ran out of memory - shutting down
> > > > > java.lang.OutOfMemoryError: Java heap space
> > > > >
> > > > > I have -Xms1024m and -Xmx1024m allocated in the
> > > > > start-options.env.unix and start-options.env.win files.
> > > > > Configuration:
> > > > > 1) Crawler server: 16 GB RAM and 8-Core Intel(R) Xeon(R) CPU E5-2660
> > > > > v3 @ 2.60GHz
> > > > > 2) Elasticsearch server: 48GB and 1-Core Intel(R) Xeon(R) CPU E5-2660
> > > > > v3 @ 2.60GHz; I am using Postgres as the database.
> > > > >
> > > > > Can you please help me out with what to do in this case?
> > > > >
> > > > > Thanks
> > > > > Priya
> > > > >
> > > > >
> > > > > On Wed, Aug 14, 2019 at 12:33 PM Karl Wright wrote:
> > > > >
> > > > > > The error occurs, I believe, as the result of basic connection
> > > > > > problems, e.g. the connection is getting rejected.  You can find
> > > > > > more information in the simple history, and in the manifoldcf log.
> > > > > >
> > > > > > I would like to know the underlying cause, since the connector
> > > > > > should be resilient against errors of this kind.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 14, 2019, 1:46 AM Priya Arora wrote:
> > > > > >
> > > > > > > Hi Karl,
> > > > > > >
> > > > > > > I have a Web repository connector (seeds: an intranet site), and
> > > > > > > the job is on the production server.
> > > > > > >
> > > > > > > When I ran the job on PROD, it stopped itself 2 times with an
> > > > > > > error: Error: Unexpected HTTP result code: -1: null.
> > > > > > >
> > > > > > >
> > > > > > > Can you please give me an idea of why this happens?
> > > > > > >
> > > > > > > Thanks and regards
> > > > > > > Priya Arora
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
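
Karl's point in the thread above (total heap divided among worker threads) can be checked with simple arithmetic. The following is a minimal sketch, not ManifoldCF code: the class name, the helper, and the default of 30 threads are assumptions made for illustration, and the real thread count has to come from your own configuration.

```java
public class HeapBudget {
    /** Rough per-thread share of the heap, in megabytes. */
    static long perThreadMb(long maxHeapBytes, int workerThreads) {
        if (workerThreads <= 0) {
            throw new IllegalArgumentException("workerThreads must be positive");
        }
        return maxHeapBytes / workerThreads / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Hypothetical thread count; pass your configured value as the first argument.
        int workerThreads = args.length > 0 ? Integer.parseInt(args[0]) : 30;
        // maxMemory() reflects the -Xmx setting of the running JVM.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("max heap MB: " + maxHeap / (1024L * 1024L));
        System.out.println("approx. MB per worker thread: "
            + perThreadMb(maxHeap, workerThreads));
    }
}
```

With the -Xmx1024m setting mentioned in the thread, every additional worker thread shrinks the share available to a single document extraction, which is why reducing threads or Tika Extractor connections is one of the suggested remedies.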


Re: ManifoldCF Resource Offering

2019-08-16 Thread Michael Cizmar
For better or worse, my company (MC+A <https://www.mcplusa.com/>) was known
for a period of time as a 'connector' company, although we are a
consultancy.  What I'm thinking initially:

   1. Connector development for repositories (effectively looking at
      what we have and migrating them over time to ManifoldCF), such as:
      1. box
      2. lithium
      3. jive
      4. confluence
      5. Sharepoint
      6. Salesforce
      7. ServiceNow
      8. OneDrive
   2. Output connector to
      1. Elastic App Search
      2. Elastic Enterprise Search
   3. Improving the end user experience
      1. Updating the documentation
      2. Providing some samples
   4. Bugs as we find them



On Fri, Aug 16, 2019 at 2:55 AM Furkan KAMACI 
wrote:

> Hi Michael,
>
> Thanks for your interest!
>
> Community is important for FOSS projects. It can both help to develop such
> improvements and use them as needed.
>
> So, in my opinion, you can share with us which parts you want to improve or
> add, so that some of us can help you and others who need them can use them.
>
> Kind Regards,
> Furkan KAMACI
>
> On Fri, Aug 16, 2019 at 07:13, Karl Wright wrote:
>
> > Excellent!
> > Thanks for contacting us.  I hope we can help.
> >
> > Karl
> >
> > On Thu, Aug 15, 2019 at 9:17 PM Michael Cizmar <mich...@michaelcizmar.com>
> > wrote:
> >
> > > Hey Folks,
> > >
> > > I exchanged emails with Karl earlier this week, but I wanted to reach
> > > out to you to offer up some additional assistance for the ManifoldCF
> > > project.  My firm has previously written about a dozen commercial
> > > connectors primarily targeted to the Google Search Appliance.  We are
> > > starting to use the ManifoldCF platform as a web crawler for
> > > Elasticsearch but have plans to expand it to work with some of the
> > > other platforms that we work with.
> > >
> > >
> > >
> > > I am planning on contributing to the code base and potentially
> > > offering commercial support to our clients.  I would like to get your
> > > thoughts on that and where you think our assistance could be best
> > > utilized.
> > >
> > >
> > > Thanks!
> > >
> > >
> > > Michael
> > >
> >
>


ManifoldCF Resource Offering

2019-08-15 Thread Michael Cizmar
Hey Folks,

I exchanged emails with Karl earlier this week, but I wanted to reach out
to you to offer up some additional assistance for the ManifoldCF project.
My firm has previously written about a dozen commercial connectors
primarily targeted to the Google Search Appliance.  We are starting to use
the ManifoldCF platform as a web crawler for Elasticsearch but have plans
to expand it to work with some of the other platforms that we work with.



I am planning on contributing to the code base and potentially offering
commercial support to our clients.  I would like to get your thoughts on that
and where you think our assistance could be best utilized.


Thanks!


Michael


Re: Reminder: August 31st is the next scheduled ManifoldCF release

2019-08-12 Thread Michael Cizmar
I think it's good to be on a cadence to release regardless of the amount
changed.  Additionally, the release of the Elastic output connector is
compelling.

On Mon, Aug 12, 2019 at 7:57 AM Karl Wright  wrote:

> I had hoped that we could finish the OpenText Content Service/Web Service
> connector by this release cycle but I do not think it will be finished.  So
> I suggest we go ahead with release plans.  It's a pretty light release I'm
> afraid.
>
> Thoughts?
> Karl
>


Re: Elastic Output Connector SSLException

2019-08-10 Thread Michael Cizmar
Update on this.  I increased the output connections from 10 to 20 and
disabled security on the repository, and the issue has not recurred.



On Fri, Aug 9, 2019 at 10:23 AM Michael Cizmar 
wrote:

> Right.  Specifically, this is occurring where the output connector
> attempts to write to the stream and the client-side is dropping it.  I've
> worked this configuration in two other environments with 'similar' setups
> without this issue so I was curious if anyone had encountered this.
>
> I'll enable additional logging on the http commons and see where that gets us.
>
> Thanks!
>
> 
> From: Karl Wright 
> Sent: Friday, August 9, 2019 8:37 AM
> To: dev 
> Subject: Re: Elastic Output Connector SSLException
>
> "Connection Reset" sounds like something in the server's SSL configuration
> is dropping the connection because it doesn't like the protocol that was
> negotiated.  This might be a heavy-handed way of addressing security issues
> that arose with some ciphers used in SSL a year or two ago, not sure.
> MCF's use of SSL doesn't disable the deprecated ciphers; it will use them
> if that is what is negotiated.
>
> This is going to require more debugging help than this list can probably
> provide.
>
> Karl
>
>
> On Fri, Aug 9, 2019 at 9:26 AM Michael Cizmar 
> wrote:
>
> > Curious if anyone has received this connection exception on the elastic
> > output connector and/or had an idea for the root cause.
> >
> > Thanks,
> >
> > Michael
> >
> >
> >
> >
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
> > ~[httpclient-4.5.6.jar:4.5.6]
> > at
> >
> >
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
> > ~[httpclient-4.5.6.jar:4.5.6]
> > at
> >
> >
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
> > ~[httpclient-4.5.6.jar:4.5.6]
> > at
> >
> >
> org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection$CallThread.run(ElasticSearchConnection.java:147)
> > ~[?:?]
> > Caused by: javax.net.ssl.SSLException: java.net.SocketException:
> Connection
> > reset
> > at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
> > ~[?:1.8.0_212]
> > at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
> > ~[?:1.8.0_212]
> > at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1903)
> > ~[?:1.8.0_212]
> > at
> > sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1867)
> > ~[?:1.8.0_212]
> > at
> > sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1812)
> > ~[?:1.8.0_212]
> > at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:128)
> > ~[?:1.8.0_212]
> >
>


Re: Elastic Output Connector SSLException

2019-08-09 Thread Michael Cizmar
Right.  Specifically, this is occurring where the output connector attempts to 
write to the stream and the client-side is dropping it.  I've worked this 
configuration in two other environments with 'similar' setups without this 
issue so I was curious if anyone had encountered this.

I'll enable additional logging on the http commons and see where that gets us.

Thanks!


From: Karl Wright 
Sent: Friday, August 9, 2019 8:37 AM
To: dev 
Subject: Re: Elastic Output Connector SSLException

"Connection Reset" sounds like something in the server's SSL configuration
is dropping the connection because it doesn't like the protocol that was
negotiated.  This might be a heavy-handed way of addressing security issues
that arose with some ciphers used in SSL a year or two ago, not sure.
MCF's use of SSL doesn't disable the deprecated ciphers; it will use them
if that is what is negotiated.

This is going to require more debugging help than this list can probably
provide.

Karl


On Fri, Aug 9, 2019 at 9:26 AM Michael Cizmar 
wrote:

> Curious if anyone has received this connection exception on the elastic
> output connector and/or had an idea for the root cause.
>
> Thanks,
>
> Michael
>
>
>
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
> ~[httpclient-4.5.6.jar:4.5.6]
> at
>
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
> ~[httpclient-4.5.6.jar:4.5.6]
> at
>
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
> ~[httpclient-4.5.6.jar:4.5.6]
> at
>
> org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection$CallThread.run(ElasticSearchConnection.java:147)
> ~[?:?]
> Caused by: javax.net.ssl.SSLException: java.net.SocketException: Connection
> reset
> at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
> ~[?:1.8.0_212]
> at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
> ~[?:1.8.0_212]
> at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1903)
> ~[?:1.8.0_212]
> at
> sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1867)
> ~[?:1.8.0_212]
> at
> sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1812)
> ~[?:1.8.0_212]
> at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:128)
> ~[?:1.8.0_212]
>
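
Karl's hypothesis above is that the connection is reset because one side dislikes the negotiated protocol or cipher. One way to gather evidence is to list what the client JVM offers by default and compare it with what the Elasticsearch side accepts. This is a minimal sketch using only standard JSSE calls; the class name is made up for illustration, and it assumes the connector uses the JVM's default SSLContext, which may not hold if a custom keystore or socket factory is configured.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

public class TlsDefaults {
    /** Protocol versions the JVM's default SSLContext enables for new connections. */
    static String[] enabledProtocols() throws NoSuchAlgorithmException {
        SSLParameters params = SSLContext.getDefault().getDefaultSSLParameters();
        return params.getProtocols();
    }

    /** Cipher suites the JVM's default SSLContext enables for new connections. */
    static String[] enabledCipherSuites() throws NoSuchAlgorithmException {
        return SSLContext.getDefault().getDefaultSSLParameters().getCipherSuites();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println("protocols: " + Arrays.toString(enabledProtocols()));
        for (String suite : enabledCipherSuites()) {
            System.out.println("cipher: " + suite);
        }
    }
}
```

Running the agents process with the standard JSSE debug property (-Djavax.net.debug=ssl:handshake) should additionally show which protocol and cipher were actually negotiated before the reset.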


Elastic Output Connector SSLException

2019-08-09 Thread Michael Cizmar
Curious if anyone has received this connection exception on the elastic
output connector and/or has an idea of the root cause.

Thanks,

Michael


org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.6.jar:4.5.6]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.6.jar:4.5.6]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.6.jar:4.5.6]
at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection$CallThread.run(ElasticSearchConnection.java:147) ~[?:?]
Caused by: javax.net.ssl.SSLException: java.net.SocketException: Connection reset
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208) ~[?:1.8.0_212]
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946) ~[?:1.8.0_212]
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1903) ~[?:1.8.0_212]
at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1867) ~[?:1.8.0_212]
at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1812) ~[?:1.8.0_212]
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:128) ~[?:1.8.0_212]


[jira] [Commented] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Michael Cizmar (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897375#comment-16897375
 ] 

Michael Cizmar commented on CONNECTORS-1615:


Roger that.  Please forgive my ignorance, is that a pull request to github or 
how would I suggest a change?   

> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
> Issue Type: Bug
> Components: JDBC connector
> Affects Versions: ManifoldCF 2.13
> Reporter: Michael Cizmar
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Michael Cizmar (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897370#comment-16897370
 ] 

Michael Cizmar commented on CONNECTORS-1615:


That doesn't explain why the logging can't be changed.

And couldn't a prepared statement be executed to verify the query?

> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
> Issue Type: Bug
> Components: JDBC connector
> Affects Versions: ManifoldCF 2.13
> Reporter: Michael Cizmar
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Commented] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Michael Cizmar (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897158#comment-16897158
 ] 

Michael Cizmar commented on CONNECTORS-1615:


Right.  However, the case is:
 # Processing on the job stops when a single null is returned from the 
result set
 # The error message points the user to the structure of the query rather 
than to the result

So ManifoldCF could handle 99+% of the result set but fails due to one bad 
record.  This record could come in after the initial configuration.  As I said, 
the message does not direct the user to the root problem.  What I had to do was 
look up the code to see that there was in fact a null check and then find out 
that in row 40k+ there was a null.

The message should say explicitly what failed: "Null was returned for identity 
column, bad seed query".
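
The distinction argued for above (a missing id column versus a null id value in one row) can be sketched as follows. This is an illustration only and not the JDBC connector's code: the class, the ID_COLUMN label, and the use of plain maps as rows are all hypothetical stand-ins, and skipping the null row is just one possible answer to the open question of whether the whole job should fail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SeedRowCheck {
    // Hypothetical column label; the real connector reads its own constant.
    static final String ID_COLUMN = "id_col";

    /** Collects ids; a missing column is fatal, a null value is reported and skipped. */
    static List<String> collectIds(List<Map<String, Object>> rows, List<String> warnings) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            Map<String, Object> row = rows.get(i);
            if (!row.containsKey(ID_COLUMN)) {
                // Structural problem: the query itself does not return the column.
                throw new IllegalStateException(
                    "Bad seed query; doesn't return " + ID_COLUMN + " column");
            }
            Object o = row.get(ID_COLUMN);
            if (o == null) {
                // Data problem: the column exists but this row's value is null.
                warnings.add("Null was returned for identity column in row "
                    + (i + 1) + "; row skipped");
                continue;
            }
            ids.add(o.toString());
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        Map<String, Object> good = new HashMap<>();
        good.put(ID_COLUMN, "doc-1");
        Map<String, Object> bad = new HashMap<>();
        bad.put(ID_COLUMN, null);
        rows.add(good);
        rows.add(bad);
        List<String> warnings = new ArrayList<>();
        System.out.println(collectIds(rows, warnings));
        System.out.println(warnings);
    }
}
```

Separating the two checks lets each failure carry its own message, so a null in row 40k+ is reported as a data problem rather than as a malformed query.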

 

 

> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
> Issue Type: Bug
> Components: JDBC connector
> Affects Versions: ManifoldCF 2.13
> Reporter: Michael Cizmar
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Created] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-30 Thread Michael Cizmar (JIRA)
Michael Cizmar created CONNECTORS-1615:
--

 Summary: Bad Error Message when IDCOLUMN's value is actually null
 Key: CONNECTORS-1615
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
 Project: ManifoldCF
  Issue Type: Bug
  Components: JDBC connector
Affects Versions: ManifoldCF 2.13
Reporter: Michael Cizmar


In the edge case that the id column is null, the error message doesn't suggest 
that.

 
{code:java}
Object o = row.getValue(JDBCConstants.idReturnColumnName);
if (o == null)
  throw new ManifoldCFException("Bad seed query; doesn't return $(IDCOLUMN) column. "
    + "Try using quotes around $(IDCOLUMN) variable, e.g. \"$(IDCOLUMN)\", "
    + "or, for MySQL, select \"by label\" in your repository connection.");
String idValue = JDBCConnection.readAsString(o);
{code}
 

 

Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Commented] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-06-24 Thread Michael Cizmar (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871075#comment-16871075
 ] 

Michael Cizmar commented on CONNECTORS-1519:


Cool.  Using something like Pre-emptive at the very least improves performance. 
 I'll look at your diff.  Thanks!

> CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y
> ---
>
> Key: CONNECTORS-1519
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1519
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: ElasticSearchConnection.diff
>
>
> Investigating CLIENTPROTOCOLEXCEPTION when using 2.10 with ES 6.x.y
> More information to follow.
> Fails when using security , i.e. 
> [http://user:password@elasticsearch:9200.|http://user:password@elasticsearch:9200./]
> Remedy:
>  # Disable x-pack security.
>  # Use http://elasticsearch:9200.
>  
>  
> |07-27-2018 17:53:19.010|Indexation 
> (ES)|file:/var/manifoldcf/corpus/14.html|CLIENTPROTOCOLEXCEPTION|38053|23|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)