Of course you are right that there is a balance to be struck for performance.
Perhaps this is a chance for me to check my understanding of Jena's
architecture: as far as I can tell, jena-core offers no way to control that
balance, because its abstractions do not distinguish between resources that
are near the compute and those farther away across the network. That
distinction only becomes apparent in modules like jena-tdb.

Truthfully, the qualities that attract me to this change are not performance or 
power, but concision and clarity.
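
To make the concision point concrete, here is a minimal sketch that uses only
JDK 8 types. Everything in it (the class name, the streamOf helper, the plain
strings standing in for subjects) is a hypothetical stand-in for illustration,
not Jena API, and it is not meant as the shape the eventual change would take.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class IteratorToStreamSketch {

    // Hypothetical helper (not Jena API): wrap any Iterator as a sequential Stream.
    static <T> Stream<T> streamOf(Iterator<T> it) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    // Reusable predicates, defined once and combined where needed.
    static final Predicate<String> IS_HTTP = s -> s.startsWith("http://");
    static final Predicate<String> IS_EXAMPLE = s -> s.contains("example.org");

    // Keep the distinct subjects that satisfy both predicates, in encounter order.
    static List<String> httpExampleSubjects(Iterator<String> subjects) {
        return streamOf(subjects)
                .filter(IS_HTTP.and(IS_EXAMPLE))
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Plain strings stand in for whatever an iterator over subjects would yield.
        Iterator<String> subjects = Arrays.asList(
                "http://example.org/a",
                "urn:uuid:1234",
                "http://example.org/a",
                "http://other.test/b").iterator();
        System.out.println(httpExampleSubjects(subjects)); // prints [http://example.org/a]
    }
}
```

The filtering, de-duplication, and collection steps all come from the JDK, so
the only code left to maintain in a sketch like this is the small adapter.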

I'm very familiar with Guava's Iterators, Iterables, FluentIterable, etc., but I
don't think they offer much more, with respect to API, than Jena's
ExtendedIterator already has. I certainly wouldn't mind replacing some of Jena's
implementation code with functions from Guava, which are exceedingly
well-exercised. If it seems reasonable to increase Guava's footprint in the
codebase, I could do that as part of a migration of NiceIterator into
ExtendedIterator. My overall aim here (which may or may not be a good or
important one in the context of the whole project) is to replace a reasonable
amount of the Jena-homegrown API and implementation with functionally and
ergonomically equivalent-or-superior code that is the common property of the
largest possible community.

As to fluent syntax for basic types, are you referring to the plethora of calls
to ResourceFactory.createResource(), .createLiteral(), and the like that are
currently needed? (Because I'm not a big fan of that sort of thing, myself.
{grin})

---
A. Soroka
The University of Virginia Library

On May 4, 2015, at 2:08 PM, Paul Houle <[email protected]> wrote:

> I use the JDK8 stream stuff a lot these days but it certainly has its
> discontents.  In particular the parallel stuff is based on the Fork/Join
> framework;  it seems to do OK on correctness,  which puts it ahead of some
> miracle frameworks for parallelization.  However,  if you understand the
> rough balance between concurrency overhead,  cpu time, and time spent
> waiting for resources far from the cpu,  you can quickly tune
> ExecutorService to get much better speedup more reliably and also pipeline
> tasks which makes a big difference.
> 
> Still I like the idea of being able to turn result sets to streams with a
> .stream() operator.
> 
> The Google guava library has a system that does stream()-like operations to
> Iterables and Iterators and right now I like the syntax better possibly
> because I have been using it so long (with Jena objects)
> 
> In the other direction you have Spark,  where you are writing what looks
> like the same kind of code but you have many options in terms of threads,
> clusters,  memory or on-disk,  etc.
> 
> ----
> 
> As for those statics,  I'd say I want to see a more fluent syntax for
> common Jena operations.  For instance,  I use the Jena in-memory model the
> way that most programmers use hashtables.  With the models you have all the
> cool "Resource" and "Property" types but you need to write code to create
> these things to put them in all the slots and it starts to obscure the
> simplicity of what is going on.
> 
> 
> 
> On Mon, May 4, 2015 at 11:50 AM, [email protected] <[email protected]>
> wrote:
> 
>> Thank you for the "heads up": I was unaware of Commons Functor. It is nice
>> to see the Commons project put a product in that space. I notice that
>> Functor's basic types do not inherit from the recently introduced Java 8
>> types (e.g. Function, BiFunction), and that in fact, by a glance at some of
>> its POMs, Functor seems to be using Java 5. Is there some expectation of
>> moving that forward, or is Functor expected to "bridge" older versions of
>> Java?
>> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>> On May 2, 2015, at 7:14 PM, Bruno P. Kinoshita <[email protected]> wrote:
>> 
>>>> It would let Jena cut out a fair bit of API and implementation code in
>> favor of letting Java itself do the work.
>>> 
>>> +1
>>>> Does this seem like a useful direction of work? I believe it could be
>> undertaken without being disruptive, and even without too much code churn
>> except when introducing Stream into the core. If it sounds like a good
>> idea, I would be happy to begin it.
>>> I will take a look at each item later, but probably others can confirm
>> whether that makes sense or not, since I'm still getting myself more
>> familiar with Jena code base.
>>> 
>>> But on a side note, I'm planning to start a few dev cycles on Apache
>> Commons Functor in June/July. The idea of the project is to provide FP
>> extensions to Java, much like Commons Lang does for the general language.
>> If while you are working on adding Java 8 to Jena you find yourself
>> creating code that you think could be useful for other projects, please
>> feel free to submit an issue to
>> https://issues.apache.org/jira/browse/FUNCTOR or ping the commons dev
>> mailing list :-)
>>> All the best, Bruno
>>> 
>>> 
>>>     From: "[email protected]" <[email protected]>
>>> To: [email protected]
>>> Sent: Saturday, May 2, 2015 6:05 AM
>>> Subject: Java 8 Streams Was: What can be removed/simplified ?
>>> 
>>> I've noticed a few more places where some Java 8 changes could be
>> brought into play in the interest of simplification, and in particular, the
>> use of Java 8 Streams seems like a nice way to go. It would let Jena cut
>> out a fair bit of API and implementation code in favor of letting Java
>> itself do the work. Here is a small program of incremental changes that I'd
>> like to propose:
>>> 
>>> - We could move NiceIterator's methods up into ExtendedIterator as
>> default implementations and factor NiceIterator out of existence.
>>> 
>>> - Then, we could migrate the API of ExtendedIterator to be a close
>> analog to a subset of the API of Java 8's Stream. (It's not too far away
>> right now.)
>>> 
>>> - Then, we could begin replacing the use of ExtendedIterator, its
>> subtypes (e.g. StmtIterator), and their implementations with Java 8
>> Streams. That will certainly take a few steps in itself, since
>> ExtendedIterator is in use all over, but I'm confident (perhaps arrogantly
>> so {grin}) that replacing its use at some fairly low-lying levels (I think
>> around and just below TripleStore.find(Triple)) will allow some quick
>> replacement moves at the levels above.
>>> 
>>> - Then, we could begin exposing Stream<T>s in the signatures of new
>> methods on very public-facing types like Model. For example, by analogy to
>> Model.listSubjects() returning ResIterator, there could also be
>> Model.streamSubjects() returning Stream<Resource>.
>>> 
>>> And then, I hope, the community would begin migrating away from the
>> ExtendedIterator methods and to the Java 8 Stream<T> methods, because
>> Stream has so much attractive functionality available.
>>> 
>>> Does this seem like a useful direction of work? I believe it could be
>> undertaken without being disruptive, and even without too much code churn
>> except when introducing Stream into the core. If it sounds like a good
>> idea, I would be happy to begin it.
>>> 
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>> 
>>> 
>>> 
>>> On May 1, 2015, at 12:19 PM, Claude Warren <[email protected]> wrote:
>>> 
>>>> An example is:
>>>> 
>>>> org.apache.jena.security.utils.RDFListSecFilter
>>>> 
>>>> Which filters results based on user access and is used wherever a
>> RDFList
>>>> (or an iterator on one) is returned.
>>>> 
>>>> Claude
>>>> 
>>>> On Fri, May 1, 2015 at 5:12 PM, [email protected] <[email protected]>
>>>> wrote:
>>>> 
>>>>> Oh, now I think I understand your point better.
>>>>> 
>>>>> Yes, I have already trawled that code and worked over those reusable
>> guys,
>>>>> and yes, you will certainly still be able to combine and reuse
>> Predicates
>>>>> in the same way that you have used Filters. When I get this PR in, you
>> can
>>>>> see some examples of that.
>>>>> 
>>>>> A Java 8 Predicate is just an interface that looks much like Jena's
>>>>> Filter, which can benefit from the -> lambda syntax and which is
>> designed to
>>>>> fit into the Java 8 language APIs (e.g. for use with Streams).
>>>>> 
>>>>> ---
>>>>> A. Soroka
>>>>> The University of Virginia Library
>>>>> 
>>>>> On May 1, 2015, at 12:07 PM, Claude Warren <[email protected]> wrote:
>>>>> 
>>>>>> We have a number of places where Filter objects are created and reused
>>>>>> (usually due to complexity or to reduce the code footprint in terms
>> of
>>>>>> debugging).  Will it still be possible to define these complex filters
>>>>> and
>>>>>> use them in multiple places?
>>>>>> 
>>>>>> The permissions system does this in that it creates a filter for
>> RDFNodes
>>>>>> and then applies them to the 3 elements in a triple to create a single
>>>>>> filter for triples.
>>>>>> 
>>>>>> There are several cases like this.
>>>>>> 
>>>>>> I will have to look at the permissions code to find a concrete
>> example,
>>>>> but
>>>>>> I think this is the case.
>>>>>> 
>>>>>> Claude
>>>>>> 
>>>>>> On Fri, May 1, 2015 at 4:53 PM, [email protected] <
>> [email protected]>
>>>>>> wrote:
>>>>>> 
>>>>>>>> As for the Filter implementation..... will that be transparent to
>>>>> filter
>>>>>>> implementations?  I assume so.
>>>>>>> 
>>>>>>> I think this was in response to my question about Filter?
>>>>>>> 
>>>>>>> If you mean that things that currently implement Filter (outside of
>>>>> Jena's
>>>>>>> own code) will not be greatly affected, then yes, so I would hope. I
>>>>> will
>>>>>>> mark Filter and its methods @Deprecated, but that seems to me to be all
>> that
>>>>> is
>>>>>>> needed for this first step.
>>>>>>> 
>>>>>>> I should have a PR with this later today, when you can observe some
>> real
>>>>>>> code and give me feedback.
>>>>>>> 
>>>>>>> ---
>>>>>>> A. Soroka
>>>>>>> The University of Virginia Library
>>>>>>> 
>>>>>>> On May 1, 2015, at 11:47 AM, Claude Warren <[email protected]> wrote:
>>>>>>> 
>>>>>>>> I don't see any reason not to remove the Node functions.
>>>>>>>> 
>>>>>>>> As for the Filter implementation..... will that be transparent to
>>>>> filter
>>>>>>>> implementations?  I assume so.
>>>>>>>> 
>>>>>>>> On Fri, May 1, 2015 at 4:16 PM, Andy Seaborne <[email protected]>
>> wrote:
>>>>>>>> 
>>>>>>>>> (mainly for Claude - I did check jena-permissions and didn't see any
>>>>>>> usage)
>>>>>>>>> 
>>>>>>>>> There are a bunch of deprecated statics in Node (the correct way
>> is to
>>>>>>> use
>>>>>>>>> NodeFactory)
>>>>>>>>> 
>>>>>>>>> Node.createAnon()
>>>>>>>>> Node.createAnon(AnonId)
>>>>>>>>> Node.createLiteral(LiteralLabel)
>>>>>>>>> Node.createURI(String)
>>>>>>>>> Node.createVariable(String)
>>>>>>>>> Node.createLiteral(String)
>>>>>>>>> Node.createLiteral(String, String, boolean)
>>>>>>>>> Node.createLiteral(String, String, RDFDatatype)
>>>>>>>>> Node.createLiteral(String, RDFDatatype)
>>>>>>>>> Node.createUncachedLiteral(Object, String, RDFDatatype)
>>>>>>>>> Node.createUncachedLiteral(Object, RDFDatatype)
>>>>>>>>> 
>>>>>>>>> It looks like they are not used by the jena codebase and are there
>> for
>>>>>>>>> compatibility only.
>>>>>>>>> 
>>>>>>>>> Any reason not to remove them?
>>>>>>>>> 
>>>>>>>>>      Andy
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> I like: Like Like - The likeliest place on the web
>>>>>>>> <http://like-like.xenei.com>
>>>>>>>> LinkedIn: http://www.linkedin.com/in/claudewarren
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> I like: Like Like - The likeliest place on the web
>>>>>> <http://like-like.xenei.com>
>>>>>> LinkedIn: http://www.linkedin.com/in/claudewarren
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> I like: Like Like - The likeliest place on the web
>>>> <http://like-like.xenei.com>
>>>> LinkedIn: http://www.linkedin.com/in/claudewarren
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> -- 
> Paul Houle
> 
> *Applying Schemas for Natural Language Processing, Distributed Systems,
> Classification and Text Mining and Data Lakes*
> 
> (607) 539 6254    paul.houle on Skype   [email protected]
> https://legalentityidentifier.info/lei/lookup
> <http://legalentityidentifier.info/lei/lookup>
