Re: Open CMIS 0.11 + Alfresco 4.2 Community Edition Question

2014-08-12 Thread Joshy Augustine
Hi Peter and Sebastian,

Thanks again for your help.

I tried moving to Lucene, following the instructions at
http://benjaminbaka.wordpress.com/2013/05/23/moving-from-solr-to-lucene-in-alfresco-4-0-e-full-reindex/.
(@Sebastian, I had to use this link since I am on Alfresco 4.2.)

I have not seen the original query issue since the switch, so from that
perspective it looked like a step forward. However, I then ran into another
issue (see below; possibly a configuration issue?) that I couldn't get my
head around, so I went back to the interim approach of SOLR plus a
20-second delay in my test code.

@Peter, I do see your point that I could hit the same issue even with
Lucene. I may not be able to stick to the Alfresco MDQ guidelines, since my
use case is an ad hoc search that should return results scoped under a
folder tree (so the IN_TREE() clause is mandatory?). Perhaps in real-world
use it is acceptable that, for a very short period of time, the query might
not return the full result set, and that the user is encouraged to retry
the query if they believe a document was only just added.

*Details on the issue I faced on the Lucene set-up*

The last line of my test program cleans up all the folders it created via
the following API call.


folder = (Folder) session.getObject(id, context);
folder.deleteTree(true, UnfileObject.DELETE, true);

Ever since I made the switch to Lucene, I get a
CmisObjectNotFoundException for one of the folders I am trying to delete
(the others are deleted without any exception). The same exception occurs
from both CMIS Workbench and my test code. Even though the exception is
thrown, the folder tree is deleted. If I delete the folder via the Alfresco
Web UI, it is deleted without any error.



Thanks,
Joshy





Re: Open CMIS 0.11 + Alfresco 4.2 Community Edition Question

2014-08-12 Thread Peter Monks
Switching to Lucene is only a partial solution, since full text indexing (which 
is used by the CONTAINS predicate) is always asynchronous.  It's also possible 
to configure Lucene indexing to be fully asynchronous (like SOLR), although if 
the target Alfresco server is under your control this may not be an issue in 
your case.  The MDQ capability in 4.2+ is a better choice, if you're able to 
stick to MDQ-compatible queries.

Generally speaking this is not a problem in practice, provided one is aware of 
the behaviour and adjusts for it (e.g. by not assuming query is transactionally 
consistent, by using folder listing / path or id lookups wherever possible, 
etc.).

Cheers,
Peter

Apologes for speling & gramar erorrs - sent from mobil deivce


Re: Open CMIS 0.11 + Alfresco 4.2 Community Edition Question

2014-08-12 Thread Sebastian Danninger
Hi Joshy,

Try switching Alfresco to Lucene indexing instead of Solr; then you won't
have the delay. We had exactly the same problem.

This URL describes the procedure almost exactly (some of the files it
tells you to delete are not available):
http://deepak-keswani.blogspot.co.at/2012/12/how-to-disable-solr-enable-lucene-on.html

All the best,

Sebastian




Re: Open CMIS 0.11 + Alfresco 4.2 Community Edition Question

2014-08-12 Thread Joshy Augustine
Hi Peter,

Many thanks for your reply. Based on your explanation, I did some more
experiments, and I think the results prove that you are right.

1) I put a breakpoint just before session.query(), used CMIS Workbench
to verify that the query returned results, and then stepped over the
breakpoint. session.query() now returned the correct results.

2) Your explanation and the experiment above suggested that leaving more
time between adding documents to a folder and executing the query should
produce correct results. By trial and error I found that with a 20-second
wait between adding a document and executing the query via the OpenCMIS
API, I get the correct results in 10 out of 10 trials.

I assume the above behaviour (i.e. Alfresco's asynchronous indexing causing
incorrect query results for a very short period of time) is not an issue in
real-world situations? Or is there any way to address this?
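
One alternative to a fixed 20-second wait would be to poll: re-run the
query at short intervals until the expected number of rows shows up or a
deadline passes. The following is only a minimal sketch under stated
assumptions. QueryPoller and pollUntilCount are hypothetical names, not part
of OpenCMIS, and the query itself is passed in as a plain Supplier so the
sketch does not depend on any particular client API.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper: polls a query instead of sleeping for a fixed time. */
final class QueryPoller {

    private QueryPoller() {}

    /**
     * Re-runs the query until it yields at least expectedCount rows or the
     * timeout elapses. Returns the last result seen, which may still be
     * incomplete if the index has not caught up by the deadline.
     */
    static <T> List<T> pollUntilCount(Supplier<List<T>> query, int expectedCount,
                                      Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        List<T> rows = query.get();
        // Compare via subtraction so the check is safe against nanoTime overflow.
        while (rows.size() < expectedCount && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up early with what we have.
                Thread.currentThread().interrupt();
                return rows;
            }
            rows = query.get();
        }
        return rows;
    }
}
```

In the test program, the supplier would wrap the session.query() call and
collect the cmis:objectId values into a list, with the 20 seconds used as
the upper bound rather than as a fixed wait.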

Thanks,
Joshy






-- 
Cheers,
Josh


Re: Open CMIS 0.11 + Alfresco 4.2 Community Edition Question

2014-08-11 Thread Peter Monks
G’day Josh,

One thing to watch out for with Alfresco is that query is “eventually 
consistent” - indexing of new / updated content is done asynchronously and 
those indexes are used by the query engine.  As of Alfresco 4.2 there’s a 
metadata query capability [1] that can be enabled to allow some (but not all) 
queries to be run in a transactionally consistent fashion, although note that 
the query below doesn’t meet the requirements to be executed as a metadata 
query (due to the LIKE and IN_TREE clauses).

That said, if this were indeed Alfresco’s eventually consistent behaviour I 
would expect both your client application and the CMIS Workbench to exhibit the 
same (or similar) results.  The indexes are a global resource, so it’s 
difficult to imagine a case where the two clients would continue to return 
inconsistent results for a lengthy period of time (unless, of course, you’re 
authenticated as different users - then it could be explained by ACLs).

At this point it’s hard to tell where the issue might lie (OpenCMIS vs custom 
code vs CMIS Workbench vs Alfresco server), so I’ve cc’ed the 
alfresco-technical-discussion google group - hopefully between the two groups 
we’ll be able to narrow down the possibilities.

Cheers,
Peter

[1] http://wiki.alfresco.com/wiki/Alfresco_Community_4.2#Metadata_Query


On 2014-08-11, at 9:13 AM, Joshy Augustine wrote:

Hi All,

I am a newbie to OpenCMIS. I am fascinated by the OpenCMIS framework, which
allowed me to write (in almost no time) a sample test program that
interacts with various ECM vendors. However, I am encountering a strange
issue with Alfresco and wondered whether any of you had the time to help?

In the sample code I am developing, I perform the following actions:

1) Create a folder in Alfresco
2) Add a few documents to it
3) Search (via CMIS query) for the documents in the folder tree that
match the search criteria.

In step 3, I execute the following API calls:


// false: do not search all versions
ItemIterable<QueryResult> queryResult = session.query(complete_statement,
        false, operationContext);

for (QueryResult qr : queryResult) {
    // Read the object id of each matching document
    String id = (String) qr.getPropertyByQueryName("cmis:objectId")
            .getFirstValue();
}

that results in the following query being executed:

SELECT cmis:objectId FROM cmis:document WHERE cmis:name LIKE 'Hello Wor%'
AND IN_TREE('c4714b61-2800-4995-8e37-2cc07549d4b2') ORDER BY
cmis:creationDate DESC

When I run the test program, most of the time this query does not return
any results. If I put a breakpoint on the session.query() line and execute
the statement using CMIS Workbench, I am able to find results.
Sometimes (but not always) even putting a Thread.sleep() before
session.query() allows me to find the documents that the query should have
found.
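
For illustration, the statement could be assembled from a name prefix and
a folder id roughly as follows. This is a sketch rather than the code from
my test program: CmisQueryBuilder and its methods are hypothetical names,
and the escaping (a backslash before quotes and backslashes, plus before %
and _ inside LIKE patterns) is my reading of the CMIS query rules, so treat
it as an assumption to verify against the repository.

```java
/** Hypothetical helper that builds the folder-tree query shown above. */
final class CmisQueryBuilder {

    private CmisQueryBuilder() {}

    /** Escapes backslashes and single quotes for a CMIS string literal. */
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\").replace("'", "\\'");
    }

    /** Additionally escapes the LIKE wildcards % and _ so they match literally. */
    static String escapeLike(String value) {
        return escapeLiteral(value).replace("%", "\\%").replace("_", "\\_");
    }

    /** Documents under folderId whose name starts with namePrefix, newest first. */
    static String documentsInTree(String namePrefix, String folderId) {
        return "SELECT cmis:objectId FROM cmis:document"
                + " WHERE cmis:name LIKE '" + escapeLike(namePrefix) + "%'"
                + " AND IN_TREE('" + escapeLiteral(folderId) + "')"
                + " ORDER BY cmis:creationDate DESC";
    }
}
```

Calling documentsInTree("Hello Wor", "c4714b61-2800-4995-8e37-2cc07549d4b2")
gives the statement above on a single line. I believe OpenCMIS also has a
QueryStatement (via session.createQueryStatement) that takes care of the
escaping, which would probably be the better choice in real code.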


Any idea what is the best way to debug this issue?


Cheers,
Josh