I just posted a question about Spark because I'm heading down the
streaming route: we're aggregating multiple large datasets, and our
1.5TB TDB was causing us some issues. We have many large graph writes
of between 1-4Mb triples, which I currently write to a number of TDBs,
and I use a set of streaming utility methods to aggregate the TDBs for
the find methods. This lends itself to RDD filter calls.
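
As a rough illustration of the aggregation described above, here is a
minimal sketch that uses plain Java 8 streams rather than Spark or Jena.
The Triple class, the find method and the sample data are hypothetical
stand-ins, not TDB or Spark APIs; each "store" is just an in-memory list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MultiStoreFindSketch {

    /** Hypothetical stand-in for an RDF triple; not a Jena class. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
        @Override public String toString() { return s + " " + p + " " + o; }
    }

    /** Lazily concatenates all stores and keeps only the matching triples. */
    static Stream<Triple> find(List<List<Triple>> stores, Predicate<Triple> pattern) {
        return stores.stream()
                     .flatMap(List::stream)   // one stream over every store
                     .filter(pattern);        // comparable to an RDD filter call
    }

    /** Runs find over two small sample stores (assumed data). */
    static List<String> demo() {
        List<Triple> storeA = Arrays.asList(
            new Triple("a", "knows", "b"),
            new Triple("a", "name", "Alice"));
        List<Triple> storeB = Arrays.asList(
            new Triple("c", "knows", "d"));
        return find(Arrays.asList(storeA, storeB), t -> t.p.equals("knows"))
                .map(Triple::toString)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The filter is only evaluated when the stream is consumed, so no
intermediate copy of the combined stores is built.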


On 16 Dec 2016 21:22, "Andy Seaborne" <a...@apache.org> wrote:

There are elements of that - see CommonsRDF - though here the operations
are on whole objects (a dataset - to query it as a Stream<Tuple> would
be to collect the tuples).

It is also like building up an executable pipeline of operations but not
running it until the final step, which allows optimization of the pipeline.

cf. Apache Spark.
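
The deferred-pipeline idea can be seen on a small scale with the Java 8
Stream API. This is only a sketch with made-up data; the class and method
names are illustrative and nothing here is Jena or Spark code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineSketch {

    /** Builds a pipeline, then runs it, recording the order of events. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();

        // Describing the pipeline does no work yet: peek and map are lazy.
        Stream<String> pipeline = Stream.of("a", "b", "c")
            .peek(x -> log.add("saw " + x))
            .map(String::toUpperCase);
        log.add("pipeline built");

        // The terminal operation is the "final step" that triggers execution.
        List<String> out = pipeline.collect(Collectors.toList());
        log.add("collected " + out);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

"pipeline built" is logged before any "saw" entry, because the elements
are only pulled through once collect is called.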


On 16/12/16 15:43, A. Soroka wrote:

> It seems to me that these ideas begin to border on the Stream API, with
> something like Stream<Tuple> at work.
>
> ---
> A. Soroka
> The University of Virginia Library
>
> On Dec 15, 2016, at 3:46 PM, Andy Seaborne <a...@apache.org> wrote:
>>
>>
>> A more considered solution:
>>
>> https://gist.github.com/afs/2b8773d10cbe4bc1161e9851de02b3eb
>>
>>         Andy
>>
>> On 14/12/16 12:52, Andy Seaborne wrote:
>>
>>>
>>>
>>> On 14/12/16 11:23, Martynas Jusevičius wrote:
>>>
>>>> But that would still require the functional subclasses of Query?
>>>>
>>>
>>> Yes, but it requires no changes to the Jena code.  There could be a
>>> library of such utilities.
>>>
>>>    static <T,X> T apply(X object, Function<X, T> f) {
>>>        return f.apply(object);
>>>    }
>>>
>>>    static <X> void apply(X object, Consumer<X> c) {
>>>        c.accept(object);
>>>    }
>>>
>>>
>>>
>>>        Dataset dataset = ... ;
>>>        Select selectQuery = new Select("SELECT * { ?s ?p ?o}");
>>>        ResultSet rs = selectQuery.apply(dataset);
>>>        Consumer<ResultSet> rsp = (t)->ResultSetFormatter.out(t);
>>>        apply(apply(dataset, selectQuery), rsp);
>>>
>>> (because Consumer isn't a Function<,Void> :-()
>>>
>>>    Andy
>>>
>>>>
>>>> On Wed, Dec 14, 2016 at 11:37 AM, Andy Seaborne <a...@apache.org>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 12/12/16 21:45, Martynas Jusevičius wrote:
>>>>>
>>>>>>
>>>>>> Well, this probably requires some generic method(s) in Dataset/Model
>>>>>> as well, something like:
>>>>>>
>>>>>>  <T> T apply(Function<Dataset, T> f);
>>>>>>
>>>>>> This would allow nice chaining of multiple queries, e.g DESCRIBE and
>>>>>> SELECT:
>>>>>>
>>>>>>  ResultSet results = dataset.apply(describe).apply(select);
>>>>>>
>>>>>
>>>>>
>>>>> No need to extend dataset and model and the rest to get experimenting:
>>>>>
>>>>> static <T,X> T apply(X object, Function<X, T> f) {
>>>>>   return f.apply(object);
>>>>> }
>>>>> // BiFunction<X, Function<X, T>, T>
>>>>>
>>>>>
>>>>> then
>>>>>
>>>>> ResultSet results = apply(
>>>>>                       apply(dataset, describe),
>>>>>                       select);
>>>>>
>>>>>
>>>>> The function f does not have any access to the internals of a specific
>>>>> dataset, so it does not need to be a method of Dataset.
>>>>>
>>>>> There is a style thing about how it looks if you are not used to
>>>>> reading functional application (i.e. backwards!).
>>>>>
>>>>>    Andy
>>>>>
>>>>>
>>>>>
>>>>>> Seems more elegant to me than all the QueryExecution boilerplate.
>>>>>>
>>>>>> On Mon, Dec 12, 2016 at 9:00 PM, A. Soroka <aj...@virginia.edu>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> What are the kinds of usages to which you are imagining these kinds of
>>>>>>> types being put?
>>>>>>>
>>>>>>> ---
>>>>>>> A. Soroka
>>>>>>> The University of Virginia Library
>>>>>>>
>>>>>>> On Dec 12, 2016, at 2:03 PM, Martynas Jusevičius
>>>>>>>> <marty...@graphity.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> Has Jena considered taking advantage of the functional features in
>>>>>>>> Java 8?
>>>>>>>>
>>>>>>>> What I have in mind is interfaces like:
>>>>>>>>
>>>>>>>> Construct extends Query implements Function<Dataset, Model>
>>>>>>>>
>>>>>>>> Describe extends Query implements Function<Dataset, Model>
>>>>>>>>
>>>>>>>> Select extends Query implements Function<Dataset, ResultSet>
>>>>>>>>
>>>>>>>> Ask extends Query implements Function<Dataset, Boolean>
>>>>>>>>
>>>>>>>>
>>>>>>>> Martynas
>>>>>>>> atomgraph.com
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>
