Re: proposal to change return type for size() in graph

Andy Seaborne Wed, 06 Nov 2013 02:49:52 -0800

We release as a whole so all modules changing at once is doable for us.

External implementations don't seem to track versions very closely (years of difference) so all this deprecation cycle stuff can only work on a very long timescale. Also, they don't allow drop-in later versions of Jena onto old versions of their implementation, which is the killer for smooth changes.

So one option is just to make the change. Any smoothed transition is not, in practice, helping anyone.


Or a transition might be:

We could mark int/size() as @Deprecated, make it return Integer.MAX_VALUE meaning "go ask another method" and add long/size2() that returns the proper answer. GraphBase implements a not-preferred version where size() calls size2().
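
A minimal sketch of that transition, to make the shape concrete. The Graph and GraphBase types here only mirror the names used in this thread and are not the actual Jena sources; FixedSizeGraph is a hypothetical implementation, and saturating at Integer.MAX_VALUE inside GraphBase is an assumption about how size() would delegate to size2().

```java
// Illustrative sketch only: these types mirror the names in this thread,
// they are not the actual Jena sources.
interface Graph {
    /**
     * @deprecated use {@link #size2()}; a result of Integer.MAX_VALUE
     * means "go ask size2()".
     */
    @Deprecated
    int size();

    /** Triple count as a long. */
    long size2();
}

abstract class GraphBase implements Graph {
    /** Not-preferred path: derive the int answer from size2(). */
    @Override
    @Deprecated
    public int size() {
        long n = size2();
        // Assumption: saturate at Integer.MAX_VALUE rather than overflow.
        return n >= Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) n;
    }
}

// Hypothetical implementation, used only to exercise the sketch.
class FixedSizeGraph extends GraphBase {
    private final long count;

    FixedSizeGraph(long count) {
        this.count = count;
    }

    @Override
    public long size2() {
        return count;
    }
}
```

With this shape, a subclass of GraphBase only has to supply size2(); old callers of size() still compile and get either the exact count or the Integer.MAX_VALUE marker.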


We switch all our code to use size2() [1].

This is still an interface incompatibility but possibly smoother. People using GraphBase have to recompile as they change version of Jena (or maybe not - all the right methods exist and don't change).

"Possibly" because of the long lag on versions we see anyway. Other changes (and we have to have the scope to make other changes somehow) stop drop-in upgrades to old systems sufficiently frequently.


Or.

Jena3.  Interface spring cleaning.  Other changes.

There is the data change around xsd:string, which might warrant Jena3.

I want to avoid getting into a long trough for Jena3 so I'm looking for how we'd get out of the change phase rather than just how to get into it.

Maybe we start running two codebases in parallel for a while, Jena2 being "maintenance only". If we delay package renaming for a while, it's quite easy to roll J3 fixes back into J2.


Of course, we have the version-lag to take into account.

JIRA is a good place to collect ideas and thoughts:

JENA-189 (Jena3/technical)
JENA-193 (RDF 1.1)

Other JIRA include:

JENA-190 (delivery)
JENA-191 (module structure)
JENA-192 (package naming)

        Andy

PS Not a double please - a long is large enough and doubles have less precision. 2^63-1 really is a very large number - 8 exa-triples. And in Java 8, 2^64-1 (sort of).
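
A small sketch of the precision point, using only java.lang. The class and method names are illustrative. A double has a 53-bit significand, so counts above 2^53 are not all representable, whereas a long is exact up to 2^63-1. The "sort of" for Java 8 refers to the unsigned helpers on Long, which reinterpret the same 64 bits rather than adding a new type.

```java
// Illustrative helper; the class and method names are hypothetical.
class LongVersusDouble {
    /** True if n is unchanged after a round trip through double. */
    static boolean survivesDoubleRoundTrip(long n) {
        return (long) (double) n == n;
    }

    /** Decimal text of the 64 bits of n read as an unsigned value (Java 8+). */
    static String asUnsigned(long n) {
        return Long.toUnsignedString(n);
    }

    public static void main(String[] args) {
        long justPastExact = (1L << 53) + 1;
        // 2^53 + 1 is not representable as a double, so this prints false.
        System.out.println(survivesDoubleRoundTrip(justPastExact));
        // Largest signed long, 2^63 - 1.
        System.out.println(Long.MAX_VALUE);
        // All 64 bits set, read as unsigned: 2^64 - 1.
        System.out.println(asUnsigned(-1L));
    }
}
```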


[1] Eclipse will do it all in one click.

On 06/11/13 08:53, Claude Warren wrote:

On further consideration, perhaps sizeEstimate could return a Numeric
Literal Node.  This would provide the ability to return very large numbers
as doubles and smaller numbers as ints and we already have the code to
convert those values to primitive numbers or Number instances.


On Wed, Nov 6, 2013 at 7:32 AM, Claude Warren wrote:

I don't see how to transition unless we change the method name to
something like sizeEstimate and return a double.  I think in most cases
size is used to determine which side of a join should go on the left for
efficiency and for unit tests.  We might want to return a statistical
answer X +/- Y (sort of like the delta in the JUnit
assertEquals(double, double, delta) tests). But this is probably stretching
a bit too far.

Claude


On Tue, Nov 5, 2013 at 10:28 PM, Andy Seaborne wrote:

On 04/11/13 12:22, Claude Warren wrote:

Currently graph.size() returns an int. The maximum value for an int
is 2,147,483,647 (2.1 billion), though model.size() returns a long.

Does it make sense to change the return type for graph.size() to long?

If not, and a graph exceeds 2.1B triples, should size just return
Integer.MAX_VALUE?

I ask as I am currently working on a project to load all of DBPedia (2.46
billion triples) into a graph.

Claude

Good idea.

How would you see the change being made? (any transition process?)

         Andy



--
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
