+1 for doing this change in Jena3.

I suggest we just change the return value to a long and add documentation clarifying that this is an estimate of size (the current docs already imply this) and that, if more than Long.MAX_VALUE triples are present, the method returns Long.MAX_VALUE.
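A rough sketch of what I mean, not the real Jena code. SizedGraph, CountingGraph and GraphSizeSketch are made-up names for illustration, and the BigInteger-backed count is only an assumption so that the saturation case can be shown:

```java
import java.math.BigInteger;

// Hypothetical, cut-down stand-in for the Graph interface; not the real Jena API.
interface SizedGraph {
    /**
     * Estimated number of triples in the graph.
     * If more than Long.MAX_VALUE triples are present, returns Long.MAX_VALUE.
     */
    long size();
}

// Hypothetical implementation whose internal count may exceed the range of a long.
final class CountingGraph implements SizedGraph {
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);
    private final BigInteger tripleCount;

    CountingGraph(BigInteger tripleCount) {
        this.tripleCount = tripleCount;
    }

    @Override
    public long size() {
        // Saturate at Long.MAX_VALUE instead of overflowing.
        return tripleCount.compareTo(LONG_MAX) > 0
                ? Long.MAX_VALUE
                : tripleCount.longValue();
    }
}

final class GraphSizeSketch {
    public static void main(String[] args) {
        // A DBPedia-scale count (2.46 billion) no longer fits in an int but fits in a long.
        SizedGraph dbpedia = new CountingGraph(BigInteger.valueOf(2_460_000_000L));
        System.out.println(dbpedia.size());

        // One more than Long.MAX_VALUE is reported as Long.MAX_VALUE.
        SizedGraph huge = new CountingGraph(LongMaxPlusOne.VALUE);
        System.out.println(huge.size() == Long.MAX_VALUE);
    }
}

// Helper constant for the demo only.
final class LongMaxPlusOne {
    static final BigInteger VALUE = BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
}
```

Callers that only compare sizes (for example to pick the smaller side of a join) would keep working unchanged; they would just see a long instead of an int.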
-1 on delaying package renaming.

0 on git usage. I haven't used it much, and when I have it drives me crazy, but that is probably because I don't use it enough. As long as there is patience with git check-in screwups, I'm OK with moving to Git.

Claude


On Wed, Nov 6, 2013 at 12:41 PM, Rob Vesse <[email protected]> wrote:

> Comments inline:
>
> On 06/11/2013 10:48, "Andy Seaborne" <[email protected]> wrote:
>
> >We release as a whole so all modules changing at once is doable for us.
> >
> >External implementations don't seem to track versions very closely (years
> >of difference) so all this deprecation cycle stuff can only work on a
> >very long timescale.  Also, they don't allow drop-in later versions of
> >Jena onto old versions of their implementation, which is the killer for
> >smooth changes.
> >
> >So one option is just make the change.  Any smoothed transition is not,
> >in practice, helping anyone.
>
> I agree, see my later comments on package rename but it would be nice to
> just change the API on the Jena 3 branch and leave those who want to stick
> with Jena 2 to lag behind as they will.  Moving to Jena 3 potentially
> allows us to ignore niceties like deprecation cycles and just simply
> remove/change stuff as necessary.  To aid transition we can always mark
> things as deprecated on the Jena 2 branch with notes that the API is
> changing in Jena 3.
>
> >
> >Or a transition might be:
> >
> >We could @Deprecate int/size(), make it return Integer.MAX_VALUE meaning
> >"go ask another method" and add long/size2() that returns the proper
> >answer.  GraphBase implements a not-preferred version where size() calls
> >size2().
> >
> >We switch all our code to use size2() [1].
> >
> >This is still an interface incompatibility but possibly smoother.
> >People using GraphBase have to recompile as they change version of Jena
> >(or maybe not - all the right methods exist and don't change).
> >
> >"Possibly" because of the long lag on versions we see anyway.
> >Other changes, and we have to have the scope to make other changes
> >somehow, do sufficiently frequently stop drop-in upgrade to old systems.
> >
> >Or.
> >
> >Jena3.  Interface spring cleaning.  Other changes.
>
> +1
>
> >
> >The data change around xsd:string which might warrant Jena3.
> >
> >I want to avoid getting into a long trough for Jena3 so I'm looking
> >for how we'd get out of the change phase rather than just how to get
> >into it.
>
> A first pass for Jena 3 would literally be package rename, obvious
> interface changes like this one and then push out an initial release.
>
> >
> >Maybe we start running two codebases in parallel for a while, Jena2
> >being "maintenance only".  If we delay package renaming for a while, it's
> >quite easy to roll J3 fixes back into J2.
>
> +1
>
> -1 to delaying package renaming since I feel that makes things trickier
> than they need to be and doesn't help version laggers if they pick up
> 3.0.0 and the APIs are virtually the same and then 3.1.0 changes all the
> package names.
>
> Back-porting to Jena 2 will probably mostly just require a Find/Replace on
> com.hp.hpl.jena to org.apache.jena so I don't see this as a reason to
> delay the package rename if we're going to do Jena 3 anyway.
>
> Moving our source control to git would make maintaining parallel branches
> and back-porting changes much easier.  We can then take advantage of
> things like git cherry-pick to aid back-porting bug fixes from Jena 3 to
> Jena 2.  So I would suggest we proceed to move to git and set up
> appropriate branches for this workflow.
>
> Rob
>
> >
> >Of course, we have the version-lag to take into account.
> >
> >JIRA is a good place to collect ideas and thoughts:
> >
> >JENA-189 (Jena3/technical)
> >JENA-193 (RDF 1.1)
> >
> >Other JIRA include:
> >
> >JENA-190 (delivery)
> >JENA-191 (module structure)
> >JENA-192 (package naming)
> >
> >	Andy
> >
> >
> >PS Not a double please - a long is large enough and doubles have less
> >precision.
> >2^63-1 really is a very large number - 8 exa-triples.  And in java8
> >2^64-1 (sort of).
> >
> >[1] Eclipse will do it all in one click.
> >
> >On 06/11/13 08:53, Claude Warren wrote:
> >> On further consideration, perhaps sizeEstimate could return a Numeric
> >> Literal Node.  This would provide the ability to return very large
> >> numbers as doubles and smaller numbers as ints, and we already have the
> >> code to convert those values to primitive numbers or Number instances.
> >>
> >>
> >> On Wed, Nov 6, 2013 at 7:32 AM, Claude Warren <[email protected]> wrote:
> >>
> >>> I don't see how to transition unless we change the method name to
> >>> something like sizeEstimate and return a double.  I think in most cases
> >>> size is used to determine which side of a join should go on the left
> >>> for efficiency, and for unit tests.  We might want to return a
> >>> statistical answer X +/- Y (sort of like the delta in the junit
> >>> assertEquals(double,double,delta) tests).  But this is probably
> >>> stretching a bit too far.
> >>>
> >>> Claude
> >>>
> >>>
> >>> On Tue, Nov 5, 2013 at 10:28 PM, Andy Seaborne <[email protected]> wrote:
> >>>
> >>>> On 04/11/13 12:22, Claude Warren wrote:
> >>>>
> >>>>> Currently graph.size() returns an int.  The maximum value for an int
> >>>>> is 2,147,483,647 (2.1 billion), though model.size() returns a long.
> >>>>>
> >>>>> Does it make sense to change the return type for graph.size() to
> >>>>> long?
> >>>>>
> >>>>> If not, and a graph exceeds 2.1B triples, should size just return
> >>>>> Integer.MAX_VALUE?
> >>>>>
> >>>>> I ask as I am currently working on a project to load all of DBPedia
> >>>>> (2.46 billion triples) into a graph.
> >>>>>
> >>>>> Claude
> >>>>>
> >>>>>
> >>>> Good idea.
> >>>>
> >>>> How would you see the change being made?  (any transition process?)
> >>>>
> >>>> Andy
> >>>>
> >>>
> >>> --
> >>> I like: Like Like - The likeliest place on the
> >>> web<http://like-like.xenei.com>
> >>> LinkedIn: http://www.linkedin.com/in/claudewarren
> >>>
> >>
> >
>

--
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
