[jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods

2016-09-16 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497176#comment-15497176
 ] 

Richard Eckart de Castilho commented on UIMA-1524:
--

2) can we include typePriorities() and typePriorities(boolean)? I think in some 
cases it may be more convenient to pass a boolean than to change the code flow 
to exclude this call.

4) I'm for special casing that. We could ask people on the list, but I never 
had anybody complaining about this behavior. A builder like "includeContext()" 
or "includeOrigin()" could be added to disable this special case.

5) how about using skip(int) to handle offsets?

6) not entirely sure what you are asking. uimaFIT definitely has the limit arg, 
but not the skip/offset arg. Not sure if we need it, but I also wouldn't object 
to having it. [~pkluegl] do you need something like this in Ruta?

7) didn't understand that. If I call reverse(), then I would expect that I 
continue to move the cursor in the opposite direction afterwards. I guess 
what you are saying is that reverse() changes the order of the list but not the 
direction in which the cursor moves. Ok, makes sense... so we should then have 
a preceding verb.

8) single() is meant to throw an exception in the case where not exactly one 
instance is available. UIMA-3234 suggests that two different exceptions should 
be thrown depending on whether the instance is missing or whether there is more 
than one instance. A get method with a positional argument does not cover these 
conditions. single() is meant to be used with singleton FSes such as 
DocumentMetaData (yes, I know that there is a dedicated getter for this in the 
CAS, but it requires typecasting if a user has created a custom subclass of it 
- also there may be additional singleton annotations in the CAS).
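
A minimal sketch of the single() semantics described in point 8, written against a plain java.util.List rather than a real UIMA index. The class name, the method signature, and the choice of NoSuchElementException versus IllegalStateException are assumptions made for illustration; they are not the UIMA or uimaFIT API.

```java
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper for illustration; not part of UIMA or uimaFIT.
public class SingleSketch {

    // Returns the only element. Throws one exception type when nothing is
    // there and a different one when the result is ambiguous, along the
    // lines suggested in UIMA-3234.
    public static <T> T single(List<T> candidates) {
        if (candidates.isEmpty()) {
            throw new NoSuchElementException("no instance found");
        }
        if (candidates.size() > 1) {
            throw new IllegalStateException(
                "expected exactly one instance but found " + candidates.size());
        }
        return candidates.get(0);
    }

    public static void main(String[] args) {
        // A singleton FS such as DocumentMetaData is the intended use case.
        System.out.println(single(Collections.singletonList("DocumentMetaData")));
    }
}
```

A positional get(int) cannot express either failure case, which is the argument for keeping a dedicated method.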


> JFSIndexRepository should be enhanced with new generic methods
> --
>
> Key: UIMA-1524
> URL: https://issues.apache.org/jira/browse/UIMA-1524
> Project: UIMA
>  Issue Type: Improvement
>  Components: Core Java Framework
>Affects Versions: 2.3
>Reporter: Joern Kottmann
>
> Existing methods should be overloaded with an additional Class argument to 
> specify the exact return type. This changes make down casting of returned 
> objects unnecessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods

2016-09-16 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497451#comment-15497451
 ] 

Marshall Schor commented on UIMA-1524:
--

A partial reply - more later...

(5) and (7) both turn on whether a word has a "declarative" or an "imperative" 
sense.  Declarative is (to my mind) more in keeping with the lazy, functional 
paradigm embodied in streams, where nothing happens until you get to 
the "terminal" operation. The declarative says more what is to happen, without 
specifying how.  The imperative is more of a how statement.

The term "skip" sounds like an imperative action.  The term "offset" sounds (to 
me) more declarative.  I, however, completely agree that it's confusing due to 
its other use within UIMA.  So we need to find another "declarative" term (if 
possible). Maybe fudgeFactor( + - nnn) ?  (offered in the spirit of 
brainstorming - might be silly, but might trigger a good thing :-) )

The term "reverse" is more like an action.  The declarative term might be 
iterate-toward-end-of-index, or iterate-toward-beginning-of-index.  Of course, 
these are too long... However, ignoring that, I think it would be intuitive 
that iterate-toward-end-of-index().iterate-toward-end-of-index() would not be a 
no-op.

When you have semantics like reverse().reverse() being a no-op, yes, it makes 
sense in terms of imperative actions, but it requires the human reader of the 
code to "play computer" and keep some state variables, in order to figure out 
what the meaning of a bit of code is.  As my brain grows feebler (!), I find I 
prefer approaches which don't require this :-), but instead, always simply mean 
just what they say.

So, I'd like to find a declarative term expressing iteration direction - 
towardFirst, towardLast, or ??
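
To make the distinction concrete, here is a small sketch of the two styles on a toy builder. The class is hypothetical, and the method names towardFirst / towardLast are only the brainstormed candidates from this comment, not a released API.

```java
// Hypothetical builder for illustration; names come from the brainstorming above.
public class DirectionSketch {
    private boolean backwards = false;

    // Imperative style: the result depends on the state left by earlier calls.
    public DirectionSketch reverse() {
        backwards = !backwards;
        return this;
    }

    // Declarative style: each call states the outcome directly.
    public DirectionSketch towardFirst() {
        backwards = true;
        return this;
    }

    public DirectionSketch towardLast() {
        backwards = false;
        return this;
    }

    public boolean isBackwards() {
        return backwards;
    }

    public static void main(String[] args) {
        // The reader has to track state to know the two calls cancel out.
        System.out.println(new DirectionSketch().reverse().reverse().isBackwards());         // false
        // Repeating the declarative call means the same thing as saying it once.
        System.out.println(new DirectionSketch().towardFirst().towardFirst().isBackwards()); // true
    }
}
```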



Re: uv3 iterators - success in avoiding all concurrent modification exceptions

2016-09-16 Thread Marshall Schor
As an experiment, I implemented a copy-on-write style of concurrent modification
exception prevention in UV3.

It does minimal copying, only copying part of the index related to the
particular type being updated; if no iterators are in use, there's no copying
(but see below).

The copy is done just once, even for multiple iterators, unless a subsequent
iterator is created after another update has happened to that part of the index.

With this, you get a trade-off: no more concurrent modification exceptions; you
can modify indexes within loops, but (incrementally) copies are made of index
parts if needed.  So it takes more space and time, due to copies sometimes being
made.

In the following case, no copies will be made:

  a) modify the indexes

  b) create an iterator, iterate, then drop references to the iterator, and have
the garbage collector gc it.

  c) repeat a and b as much as you like.

If you're through with an iterator, but it hasn't been GC'd yet, then the
modification code can't tell you're through with the iterator, and has to make a 
copy.

Is this a good trade off to make?  Should we have 2 modes of running pipelines -
with/without this feature?

-Marshall

P.S. there's an edge case caught by the test cases.  In today's world, if you 
do:
   a) modify the indexes
   b) start iterating
   c) modify the indexes
   d) do one of moveToFirst, moveToLast, or just moveTo(fs)
then the move in (d) "resets" the concurrent modification state and allows
continued use of the iterator, this time over the updated indexes.  I had to add
some more details in the impl to make this work the same way... 
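
For readers unfamiliar with the trade-off, the sketch below contrasts a fail-fast iterator with a copy-on-write one, using only JDK collections. It illustrates the general technique with java.util.concurrent.CopyOnWriteArrayList; it is not the UV3 index implementation, which copies only the affected part of an index, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustration with JDK collections only; not the UV3 index code.
public class CopyOnWriteSketch {

    // Adds one element while iterating. Returns how many elements the loop
    // visited, or -1 if the iterator failed fast.
    public static int visitWhileModifying(List<String> index) {
        int visited = 0;
        try {
            for (String ignored : index) {
                visited++;
                if (visited == 1) {
                    index.add("added-during-iteration");
                }
            }
        } catch (ConcurrentModificationException e) {
            return -1;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> cow = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(visitWhileModifying(plain)); // -1: fail-fast iterator
        System.out.println(visitWhileModifying(cow));   // 3: iterator keeps its snapshot
        System.out.println(cow.size());                 // 4: the list itself was updated
    }
}
```

The cost is visible in the second case: the write had to copy the backing array so that the running iterator could keep its old view.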

On 9/14/2016 10:11 AM, Marshall Schor wrote:
> Version 2 had snapshot iterators, used for two purposes:
>
> a) allowing underlying index modifications while iterating (over the 
> snapshot).
> Note that this includes even simple things like changing begin/end values in 
> an
> annotation (which could cause a remove/add-back to indexes action while those
> features are changed).
>
> b) performance (in some edge cases, but also has a performance cost initially
> (to create the snapshot))
>
> It might be reasonable to support case (a) more automatically.  One approach
> might be to do a "copy on write" style for the index parts.  Java has, for
> instance CopyOnWriteArrayList and CopyOnWriteArraySet.  This could add 1 more
> level of indirection in using UIMA indexes; details need to be worked out and
> could be complex (indexes need to be performant and thread-safe for reading).
>
> Does this seem like a good thing to try?
>
> -Marshall
>
>



[jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods

2016-09-16 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497345#comment-15497345
 ] 

Richard Eckart de Castilho commented on UIMA-1524:
--

2) ok, nice

4) +1

5) in my mind "offset" is quite strongly tied to the begin/end character 
offsets, so I find it more attractive to use another term for skipping/seeking 
in the index. What makes you prefer the argument form over the verb?

6) in uimaFIT, presently selectFollowing and selectPreceding both return the 
annotations in index order. I don't have a strong opinion about selectPreceding 
returning in reverse index order. Actually, it was my intuition that it would 
return in reverse order and I had to look up the source code to figure out it 
was using index order.

7) I don't understand why reverse().reverse() should not be a no-op - if it is 
not a no-op, then what is it? IMHO a positive offset/seek/skip should always go 
into iteration direction and a negative should go opposite to the iteration 
direction. I believe it would utterly confuse me if the offset/seek/skip would 
not follow the current iteration direction.

8) it could also be an option to have get() return null if there is no instance 
and throw an exception only if there is more than one instance. Btw. do you 
fancy the use of Optional in this new API? I'm not particularly fond of it 
(yet), but it seems some people are.
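
A sketch of the alternative raised in point 8, using java.util.Optional instead of a null return. It works on a plain list; the class and method names are hypothetical and nothing here is taken from an existing UIMA API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of UIMA or uimaFIT.
public class OptionalGetSketch {

    // No instance is not an error here: the caller gets an empty Optional.
    // Only an ambiguous result (more than one instance) throws.
    public static <T> Optional<T> get(List<T> candidates) {
        if (candidates.size() > 1) {
            throw new IllegalStateException(
                "more than one instance: " + candidates.size());
        }
        if (candidates.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(candidates.get(0));
    }

    public static void main(String[] args) {
        System.out.println(get(Collections.emptyList()).isPresent());         // false
        System.out.println(get(Arrays.asList("DocumentMetaData")).isPresent()); // true
    }
}
```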



[jira] [Updated] (UIMA-4210) Client hangs with more than 1 time-out

2016-09-16 Thread Jerry Cwiklik (JIRA)

 [ 
https://issues.apache.org/jira/browse/UIMA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Cwiklik updated UIMA-4210:

Fix Version/s: 2.9.0AS

> Client hangs with more than 1 time-out
> --
>
> Key: UIMA-4210
> URL: https://issues.apache.org/jira/browse/UIMA-4210
> Project: UIMA
>  Issue Type: Bug
>  Components: Async Scaleout
>Affects Versions: 2.4.2AS
> Environment: Java 7, Mac OS
>Reporter: Frank Xu
>  Labels: client, hangs
> Fix For: 2.9.0AS
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> The client hangs if the execution has two time-outs. After debugging the 
> issue, we figured out that the resending mechanism has some bugs in it. Here 
> is the detailed description.
> Please review the necessity for the invocation of sendCAS(). In our system, 
> we don't have to resend the CAS to process again. Please provide a 
> configuration so that we don't have to resend the CAS every time there is a 
> time-out.
> Whenever there is the first time-out, 
> BaseUIMAAsynchronousEngine_impl#notifyOnTimout() is invoked and it hangs when 
> it tries to invoke sendCas() at line 2385. I believe the reason is that 
> sendCAS() is a synchronized method and a potential threading issue causes 
> this thread to hang there. Please note that this block is also 
> synchronized.
> Then when there is a second time-out, it hangs at the very 
> beginning of the method notifyOnTimeout() and cannot clear the timed-out CAS 
> from the CAS list, which hangs the entire client.





[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-09-16 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497557#comment-15497557
 ] 

Richard Eckart de Castilho commented on UIMA-5106:
--

I thought the ID property (it is not a feature as in Feature Structure) 
resembles the LowLevelCas address in v2, so I'm not exactly sure why this is 
considered to be a new feature.

> uv3 constant "id" for FSs (Proposed new Feature for uv3)
> 
>
> Key: UIMA-5106
> URL: https://issues.apache.org/jira/browse/UIMA-5106
> Project: UIMA
>  Issue Type: New Feature
>  Components: Core Java Framework
>Reporter: Marshall Schor
>Priority: Minor
> Fix For: 3.0.0SDKexp
>
>
> Add constant ID for FSs. This would be an incrementing, long value. It would 
> be constant through serialization/ deserialization cycles. There would be a 
> lazily created map from longs to FSs (via weak links) to allow direct access 
> from the ID to the FS.  Lazy intent is to not have a cost for this 
> (space/time) other than the cost for 1 long / FS, if it is not used.
> We could make this feature optional, as well, to avoid the 8 bytes per FS 
> overhead, but in V3, I think that's not a good tradeoff (space savings vs 
> complexity).  
> Issues: 
> * Current design allows parallelism of services, with returned results 
> "stacked" into receiving CAS; would need to change (some of) the IDs coming 
> back.
> CAS would need to have the high-water-mark value as part of serializations.
> Backwards compatibility:
> * loading V2 CASs: generate new IDs upon loading.
> * serializing to V2: (for connecting to V2 services): drop the IDs.
> This is a proposed new V3 feature; comments appreciated.
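
One possible reading of the proposal, as a self-contained sketch: an incrementing long per FS plus an id-to-FS map that is only allocated when lookup is first requested and that holds its values through weak references. All names are hypothetical. In this sketch only FSs created after lookup is enabled can be found; how the real design would back-fill the map is not stated in the issue.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of the proposal; not the UIMA v3 implementation.
public class FsIdSketch {

    // Stand-in for a feature structure; it carries its constant id (8 bytes).
    public static final class Fs {
        private final long id;

        Fs(long id) {
            this.id = id;
        }

        public long id() {
            return id;
        }
    }

    private final AtomicLong nextId = new AtomicLong(1);

    // Stays null until lookup is requested, so a CAS that never uses the
    // feature pays only for the long stored in each FS.
    private Map<Long, WeakReference<Fs>> idToFs;

    public Fs create() {
        Fs fs = new Fs(nextId.getAndIncrement());
        if (idToFs != null) {
            idToFs.put(fs.id(), new WeakReference<>(fs));
        }
        return fs;
    }

    public void enableLookup() {
        if (idToFs == null) {
            idToFs = new HashMap<>();
        }
    }

    // Returns null if lookup is disabled, the id is unknown, or the FS was collected.
    public Fs lookup(long id) {
        if (idToFs == null) {
            return null;
        }
        WeakReference<Fs> ref = idToFs.get(id);
        return ref == null ? null : ref.get();
    }

    // The value a serialization would need to carry so ids stay unique afterwards.
    public long highWaterMark() {
        return nextId.get() - 1;
    }
}
```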





Re: the hook from svn update to Jira has been restored, nothing was lost

2016-09-16 Thread Richard Eckart de Castilho
Great :) Thanks for following this through!

-- Richard

> On 16.09.2016, at 15:41, Marshall Schor  wrote:
> 
> See
> 
> https://issues.apache.org/jira/browse/INFRA-12551
> 
> The updates were not lost, it appears.
> 
> -Marshall


Build failed in Jenkins: UIMA-DUCC #988

2016-09-16 Thread Apache Jenkins Server
See 

Changes:

[degenaro] UIMA-5110 DUCC Job Driver (JD) employ enums for EventType and 
StateType

[degenaro] UIMA-5110 DUCC Job Driver (JD) employ enums for EventType and 
StateType

[cwiklik] UIMA-5047 updated CPU monitoring and reporting

--
[...truncated 7462 lines...]
 INFO |  driverId:999 host: node4 size: 24
 INFO |  driverId:999 host: node4 size: 23
 INFO |  driverId:999 host: node3 size: 14
 INFO |  driverId:999 host: node3 size: 13
 INFO |  driverId:999 host: node4 size: 22
 INFO |  driverId:999 host: node4 size: 21
 INFO |  driverId:999 host: node4 size: 20
 INFO |  driverId:999 host: node4 size: 19
 INFO |  driverId:999 host: node3 size: 12
 INFO |  driverId:999 host: node4 size: 18
 INFO |  driverId:999 host: node4 size: 17
 INFO |  driverId:999 host: node4 size: 16
 INFO |  driverId:999 host: node4 size: 15
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  map is empty
 INFO |  map is empty
 INFO |  driverId:999 host: node3 size: 11
 INFO |  driverId:999 host: node4 size: 14
 INFO |  driverId:999 host: node3 size: 10
 INFO |  driverId:999 host: node4 size: 13
 INFO |  driverId:999 host: node3 size: 9
 INFO |  driverId:999 host: node4 size: 12
 INFO |  driverId:999 host: node4 size: 11
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  driverId:999 host: node4 size: 10
 INFO |  driverId:999 host: node4 size: 9
 INFO |  driverId:999 host: node3 size: 8
 INFO |  driverId:999 host: node4 size: 8
 INFO |  driverId:999 host: node3 size: 7
 INFO |  driverId:999 host: node3 size: 6
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  driverId:999 host: node4 size: 7
 INFO |  driverId:999 host: node3 size: 5
 INFO |  driverId:999 host: node3 size: 4
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  map is empty
 INFO |  map is empty
 INFO |  driverId:999 host: node4 size: 6
 INFO |  driverId:999 host: node4 size: 5
 INFO |  driverId:999 host: node4 size: 4
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  driverId:999 host: node4 size: 3
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  driverId:999 host: node4 size: 2
 INFO |  driverId:999 host: node3 size: 3
 INFO |  driverId:999 host: node3 size: 2
 INFO |  driverId:999 host: node4 size: 1
 INFO |  map is empty
 INFO |  map is empty
 INFO |  saving 
to:
 INFO |  
saved:
 INFO |  driverId:999 host: node4 size: 0
 INFO |  driverId:999 host: node3 size: 1
 INFO |  map is empty
 INFO |  map is empty
 INFO |  current[Completed] previous[Assigned]
 INFO |  current[Completed] previous[Assigned]
 INFO |  saving 
to:
 INFO |  

Build failed in Jenkins: UIMA-DUCC » Apache UIMA DUCC: uima-ducc-agent #988

2016-09-16 Thread Apache Jenkins Server
See 


Changes:

[cwiklik] UIMA-5047 updated CPU monitoring and reporting

--
[INFO] 
[INFO] 
[INFO] Building Apache UIMA DUCC: uima-ducc-agent 2.2.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ uima-ducc-agent ---
[TASKS] Scanning folder 
' 
for files matching the pattern '**/*.java' - excludes: 
[TASKS] Found 57 files to scan for tasks
Found 1 open tasks.
[TASKS] Computing warning deltas based on reference build #987
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-versions) @ 
uima-ducc-agent ---
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:parse-version (parse-project-version) 
@ uima-ducc-agent ---
[INFO] 
[INFO] --- uima-build-helper-maven-plugin:7:parse-date-time (set buildYear and 
buildMonth) @ uima-ducc-agent ---
[INFO] 
[INFO] --- buildnumber-maven-plugin:1.4:create (default) @ uima-ducc-agent ---
[INFO] Executing: /bin/sh -c cd 
' 
&& 'svn' '--non-interactive' 'info'
[INFO] Working directory: 

[INFO] Storing buildNumber: 1761098 at timestamp: 1474063148410
[INFO] Executing: /bin/sh -c cd 
' 
&& 'svn' '--non-interactive' 'info'
[INFO] Working directory: 

[INFO] Storing buildScmBranch: trunk
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
uima-ducc-agent ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
uima-ducc-agent ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
uima-ducc-agent ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 57 source files to 

[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
:[48,39]
 incompatible types
  required: long
  found:    java.lang.String
[INFO] 1 error
[INFO] -


[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-09-16 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15498033#comment-15498033
 ] 

Marshall Schor commented on UIMA-5106:
--

only because the low level cas address in v2 is not guaranteed to remain 
preserved across various serializations/deserializations.

This would be "elevating" this previously "internal use" value (that many users 
made use of, in spite of its lack of any stability guarantee), to a more official 
and stable status.



Re: uv3 iterators - success in avoiding all concurrent modification exceptions

2016-09-16 Thread Marshall Schor
re: remove the need for the snapshot iterators then?

Yes, mostly.  There's one other use for those iterators, I think - they can in
unusual circumstances, speed things up (but mostly, they slow things down a
little). The speed up happens if you're doing a fully sorted index with lots of
subtypes interleaved and do multiple moves forwards and backwards.  The snapshot
"flattens" the interleaved nature (if I remember correctly), and then the
forwards and backwards movement occurs more efficiently, without "rattling" the
multiple iterators (one per type) as they interleave.

-Marshall


On 9/16/2016 4:20 PM, Richard Eckart de Castilho wrote:
> On 16.09.2016, at 22:06, Marshall Schor  wrote:
>>>> Does this seem like a good thing to try?
> Definitely sounds promising. So that would remove the need for the snapshot 
> iterators then?
>
> Cheers,
>
> -- Richard



the hook from svn update to Jira has been restored, nothing was lost

2016-09-16 Thread Marshall Schor
See

https://issues.apache.org/jira/browse/INFRA-12551

The updates were not lost, it appears.

-Marshall



Re: uv3 iterators - success in avoiding all concurrent modification exceptions

2016-09-16 Thread Richard Eckart de Castilho
On 16.09.2016, at 22:06, Marshall Schor  wrote:
> 
>>> Does this seem like a good thing to try?

Definitely sounds promising. So that would remove the need for the snapshot 
iterators then?

Cheers,

-- Richard

[jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods

2016-09-16 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497310#comment-15497310
 ] 

Marshall Schor commented on UIMA-1524:
--

2) Yes.  
There's a small note in the wiki diagram which says that all builder methods 
that are "booleans" have two forms:
xxx()  // set the property
xxx(boolean v) // set the property to the passed in value

where the xxx() is chosen to be the one most often wanted / used.

typePriorities is one of these booleans, so, yes, there will also be a 
typePriorities(boolean).

4) I'm fine with special casing this.  I would not even bother to add the other 
include... unless people wanted it (wait until it's wanted, if at all).

5) re using skip(n) to handle offsets:
Do you mean **in addition** to having positional arg forms?  If so, I'm fine 
with that, but prefer the term "offset".
One reason is that skip(nnn) is already taken - it's an official stream method, 
and requires non-negative arg.

If you mean **in place of positional arg forms**, then I'm slightly against 
that.

6) my question about uimaFIT was only about the iteration direction: when 
selecting the 3 annotations preceding myFS, in what order is the iteration 
done, forward or reverse?  I was proposing to do reverse if that's what the 
majority of users seemed to want/expect, but again, I don't really care...  
people could always add reverse...

7) What I was trying to say was that you could specify a spot, and then a + or - 
offset to get to another spot, and then (independently) iterate from there in 
either direction.
I think it could be confusing if the meaning of offset + or - depended on the 
state of reverse.  I'm thinking it's clearer and less subject to mental mistakes 
if offset is always with respect to the underlying index (assumed in the forward 
direction), and reverse just applies to how the iteration goes, once you start 
iterating.

So in my model the order of the items is fixed.  Reverse makes the iteration go 
backwards (toward the front of the order).  Offset is with respect to the basic 
order.
I don't want reverse() to mean "reverse the current direction" (which would make 
reverse().reverse() a no-op) - that seems just asking for trouble in having 
this be understandable.
We will of course have reverse(boolean) though (see point 2 above).

8) I was thinking that get() would continue to throw an exception if there was 
not exactly one instance.  
get(int arg) would not.  
(trying to have it both ways :-) )
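
A sketch of the model described in points 2 and 7, on a plain ordered list instead of a UIMA index. The builder class is hypothetical: reverse() and reverse(boolean) follow the two-form convention for boolean properties, reverse() sets the property rather than toggling it, and the offset passed to startAt is always applied in the forward order of the underlying list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical builder for illustration; not the uv3 select API.
public class SelectBuilderSketch {
    private final List<String> ordered;
    private boolean reverse = false;
    private int start = 0;

    public SelectBuilderSketch(List<String> ordered) {
        this.ordered = ordered;
    }

    // Two forms for a boolean property: xxx() sets it, xxx(boolean) sets it to v.
    public SelectBuilderSketch reverse() {
        return reverse(true);
    }

    public SelectBuilderSketch reverse(boolean v) {
        this.reverse = v;
        return this;
    }

    // The offset is relative to the underlying order, whatever the direction.
    public SelectBuilderSketch startAt(int position, int offset) {
        this.start = position + offset;
        return this;
    }

    // Terminal operation: only here does the iteration direction matter.
    public List<String> toList() {
        List<String> result = new ArrayList<>();
        if (start < 0 || start >= ordered.size()) {
            return result;
        }
        if (reverse) {
            for (int i = start; i >= 0; i--) {
                result.add(ordered.get(i));
            }
        } else {
            for (int i = start; i < ordered.size(); i++) {
                result.add(ordered.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("a", "b", "c", "d", "e");
        // Negative offset, forward traversal: starts at "b".
        System.out.println(new SelectBuilderSketch(index).startAt(2, -1).toList());           // [b, c, d, e]
        // Same start position, backward traversal.
        System.out.println(new SelectBuilderSketch(index).reverse().startAt(2, -1).toList()); // [b, a]
    }
}
```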



[jira] [Closed] (UIMA-4210) Client hangs with more than 1 time-out

2016-09-16 Thread Jerry Cwiklik (JIRA)

 [ 
https://issues.apache.org/jira/browse/UIMA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Cwiklik closed UIMA-4210.
---
Resolution: Cannot Reproduce

Closing since I am not able to reproduce the problem. The cause of the hang was 
fixed in one of the other JIRAs mentioned in the thread.



[jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods

2016-09-16 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496754#comment-15496754
 ] 

Marshall Schor commented on UIMA-1524:
--

I updated the wiki with some notes.  Some thoughts:

1) The word "all" can be used to mean all types in one view, or (for some type 
perhaps) all FSs from all views.
The .allViews() builder modifies things to use all views (with no index, 
implies unordered).
To get "all" FSs without regard to type, you leave out the type specification.
  - Note: this doesn't necessarily get all FSs without regard to type; it 
depends on the index specification
-- If no index specification, then it gets all FSs which are subtypes of 
TOP (i.e., all)
-- If index spec, it gets all FSs in that index (uses the type that's 
included in the index spec as the top-most type).

2) There are multiple methods to get FSs relative to a bounding begin and end, 
and maybe also using type priorities.
   - covered, between, within: some of the proposed words for this, some taking 
1 or 2 fs's, or 2 ints (begin / end)
   - I know uimaFIT excludes type priorities.  I think this is more often what 
users want, so it's probably good to be the "default".
 -- the builder typePriorities(), for cases where the bounds are supplied by 
FSs, can change the criteria for bounding to use begin, end, and type 
priorities.

3) The above also applies to selection filters that bound a given begin and 
end, or a given FS
  - covering, containing, taking 1 FS or 2 ints begin/end

4) the following or preceding: the official "limit" method for streams throws 
an exception on negative args, so I don't want to behave differently...
  - startAt, at, seek, following/preceding (I tend to like "at", and 
following/preceding where that form has an additional arg used as the limit 
value).
  - the arg for where to start is used to specify a location in the index; if 
the index happens to contain that arg as a FS then, it's part of the result; 
unless we want to special-case this like uimaFIT seems to do.

5) Since UIMA iterators support forward/backwards, the startAt could 
efficiently be augmented with a +- offset. Variations:
  - startAt(begin, end) - start at position of left-most FS >= that begin / end
  - startAt(fs) - start at position of left-most FS >= that fs (ignoring type 
priorities unless specified)
  - The same 2 with an extra int as last arg: this is the offset 
  - NOTE: the forms with begin end only work with AnnotationIndex; others work 
with any ordered index

6) following /preceding: combining an "at" spec with a "limit" spec, and 
implying a "reverse" spec for preceding, not sure if uimaFIT does this? or if 
this is a good idea?
  - jcas.select().following(3, fs);  // select the 3 following FSs >= fs, 
ignoring typePriorities
  - jcas.select().following(3, fs, 2);  // select the 3 following FSs >= fs 
offset by + 2...

7) Just noting that reverse is independent of offset
- e.g. you can have a negative offset, and traverse in a positive direction

8) "single" seems kind of awkward.  I'm thinking of just "get()" or get(arg) 
where the arg is the same as used in startAt
   -  jcas.select(Token.type).get(15); 
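
The positioning rule in point 5 ("start at position of left-most FS >= ...") can be sketched with a binary search over a sorted array of begin offsets. This is only an illustration on ints: the class and method names are hypothetical, and clamping an out-of-range offset to the ends of the index is an assumption of the sketch, not something decided in this thread.

```java
// Hypothetical illustration of startAt positioning; not the uv3 implementation.
public class StartAtSketch {

    // Index of the left-most element >= key, or sortedBegins.length if none.
    public static int leftMostAtOrAfter(int[] sortedBegins, int key) {
        int lo = 0;
        int hi = sortedBegins.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedBegins[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Start position plus a signed offset, clamped to [0, length] (assumption).
    public static int startAt(int[] sortedBegins, int key, int offset) {
        int pos = leftMostAtOrAfter(sortedBegins, key) + offset;
        return Math.max(0, Math.min(sortedBegins.length, pos));
    }

    public static void main(String[] args) {
        int[] begins = {0, 5, 5, 9, 12};
        System.out.println(leftMostAtOrAfter(begins, 5)); // 1: first of the two equal entries
        System.out.println(startAt(begins, 5, 2));        // 3
        System.out.println(startAt(begins, 5, -3));       // 0: clamped at the front
    }
}
```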



Re: uv3 iterators - success in avoiding all concurrent modification exceptions

2016-09-16 Thread Marshall Schor
One other benefit: UIMA automatically may "under-the-covers" remove and add back
some FSs if you update some features used as keys in indexes.  This could cause
ConcurrentModificationException if you had loops that did this, even though you
had no index operations coded explicitly as part of the loop.
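
The effect can be reproduced with a plain sorted set, as a rough analogy. In the sketch below a TreeSet stands in for a sorted index and setBegin stands in for a feature setter that removes and re-adds the element because a key changed; both are hypothetical stand-ins, not UIMA code.

```java
import java.util.Comparator;
import java.util.ConcurrentModificationException;
import java.util.TreeSet;

// Analogy with JDK collections only; not UIMA index code.
public class KeyUpdateSketch {

    public static final class Annotation {
        int begin;
        final String name;

        Annotation(int begin, String name) {
            this.begin = begin;
            this.name = name;
        }
    }

    // Stand-in for a setter on a key feature: to keep the sorted index
    // consistent, the element is removed, updated, and added back.
    static void setBegin(TreeSet<Annotation> index, Annotation a, int newBegin) {
        index.remove(a);
        a.begin = newBegin;
        index.add(a);
    }

    // Returns true if updating a key inside the loop broke the iterator.
    public static boolean updateInsideLoopFails() {
        TreeSet<Annotation> index = new TreeSet<>(
            Comparator.comparingInt((Annotation a) -> a.begin)
                      .thenComparing(a -> a.name));
        index.add(new Annotation(0, "a"));
        index.add(new Annotation(5, "b"));
        index.add(new Annotation(9, "c"));
        try {
            for (Annotation a : index) {
                // The loop body only "sets a feature", yet the index is modified.
                setBegin(index, a, a.begin + 1);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(updateInsideLoopFails()); // true
    }
}
```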

-Marshall Schor





[jira] [Created] (UIMA-5111) uv3 Experiments in avoiding ConcurrentModificationException for UIMA iterators

2016-09-16 Thread Marshall Schor (JIRA)
Marshall Schor created UIMA-5111:


 Summary: uv3 Experiments in avoiding 
ConcurrentModificationException for UIMA iterators
 Key: UIMA-5111
 URL: https://issues.apache.org/jira/browse/UIMA-5111
 Project: UIMA
  Issue Type: Sub-task
  Components: Core Java Framework
Reporter: Marshall Schor
Assignee: Marshall Schor
Priority: Minor
 Fix For: 3.0.0SDKexp


Experiment with alternative approaches to avoid 
ConcurrentModificationExceptions in iterators over UIMA indexes.  Try an 
approach using minimal, localized copy-on-write for parts of indexes.


