Re: [protobuf] modifying lists within a message
seems odd that you can delete all via clearXXX() or add via addAllXXX() and addXXX() but not remove a single element by index...

well, here's a work-around - if you put a utility class into the same package as the ProtobufMessages generated class it can access the protected Builder.internalGetResult() - since package-level access is higher than protected-level access...

From: Jason Hsueh jas...@google.com
To: Ron tequila...@ymail.com
Cc: Protocol Buffers protobuf@googlegroups.com
Sent: Wed, March 17, 2010 9:31:58 AM
Subject: Re: [protobuf] modifying lists within a message

Messages are indeed immutable once built. getXXXList() for both the builder and the message should return unmodifiable lists: if you try to modify them you should get an exception. If you haven't already, take a look at the Java tutorial here: http://code.google.com/apis/protocolbuffers/docs/javatutorial.html

For your particular problem I think you'll need to do a clearXXX() on the entire list, and add back all the elements except the entry you want to discard.

On Wed, Mar 17, 2010 at 7:57 AM, Ron tequila...@ymail.com wrote:

i didn't see anything on this in the archives but i admit i didn't do an exhaustive search.

i'm using protobuf (2.3.0) to store objects in a Voldemort cluster and am working on the code to modify these objects. so i load the message fine, but it is composed of a list of other messages and i'd like to modify the contents of that list and then stream it back into the cluster. i assumed this is what Type.newBuilder(Type) was for, but have run into an interesting wrinkle trying to modify the list (once i've located the entry i want to remove by index). Builder.getXXXList() returns an unmodifiableList, but the actual message's getXXXList() returns a ref to the actual list (i.e., i could modify the list in the message, but not the list within the message within the builder).
i haven't gone through all the code yet but this seemed rather odd since i had thought messages within builders were mutable until build() but once built were considered immutable (tho perhaps that's just in my head). is there a doc showing best practices for modifying Protobuf messages i should be reading?

thanks,
...ron.

-- You received this message because you are subscribed to the Google Groups Protocol Buffers group. To post to this group, send email to proto...@googlegroups.com. To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
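The clear-and-re-add approach suggested in this thread can be sketched as below. This is a minimal illustration, not the generated protobuf code: the Builder class here is a hypothetical stand-in that only mimics the shape of a generated builder for a repeated string field named "item" (getItemList(), addItem(), addAllItem(), clearItem()). The helper removeAt() is likewise an assumed name. The survivors are copied into a fresh list before clearing, because the list returned by the getter is only a read-only view of the builder's storage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RepeatedFieldRebuild {
    /** Hypothetical stand-in for a generated builder with a repeated field "item". */
    public static final class Builder {
        private final List<String> items = new ArrayList<>();

        /** Read-only view, like the unmodifiable list described in the thread. */
        public List<String> getItemList() {
            return Collections.unmodifiableList(items);
        }

        public Builder addItem(String value) {
            items.add(value);
            return this;
        }

        public Builder addAllItem(Iterable<String> values) {
            for (String v : values) {
                items.add(v);
            }
            return this;
        }

        public Builder clearItem() {
            items.clear();
            return this;
        }
    }

    /** Drops the element at {@code index}: copy the survivors, clear, re-add. One O(n) pass. */
    public static Builder removeAt(Builder builder, int index) {
        List<String> current = builder.getItemList();
        List<String> kept = new ArrayList<>(current.size());
        for (int i = 0; i < current.size(); i++) {
            if (i != index) {
                kept.add(current.get(i));
            }
        }
        // kept is an independent copy, so clearing the builder does not empty it.
        return builder.clearItem().addAllItem(kept);
    }

    public static void main(String[] args) {
        Builder b = new Builder().addItem("a").addItem("b").addItem("c");
        removeAt(b, 1);
        System.out.println(b.getItemList()); // prints [a, c]
    }
}
```

The same loop can drop any number of elements in a single pass by replacing the index test with a predicate, which keeps the cost linear however many entries are filtered out.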
Re: [protobuf] modifying lists within a message
but this message isn't built yet - i'm trying to modify a copy of a previous message wrapping it in a newBuilder so i don't have to copy all the fields over from the previous (immutable) to the new (mutable until i call Builder.build()). the only thing (so far) that i can't do to that new message is remove a single element from its list variable.

you mean it's O(n) because of the shift of all elements after the index since ArrayList is a contiguous block? i wouldn't have called that O(n). certainly it's O(n) to find the element in the unsorted list, but that's unavoidable (unless i add code to put the elements in sorted...) once i find the index of the single element in the list i want i have to remove it and then add a new (copied and then modified from the original) element to replace it. (order is unimportant, at least in this case).

and yes, it is true - classes in the same package have access to each other's protected members. package and protected might MEAN something different but the way the JVM (the Sun JVM anyway) acts the access level is an int: 0 = public, 1 = protected, 2 = package, 3 = private - you can access everything <= your access-level with respect to the other class.

t'would be a pity for you to fix it - i was rather happy to figure that one out. :) in which case wiping the entire list and inserting N-1 elements back in will be my only (non-reflection-based) option (which i'm pretty sure is more costly than removing a single element from an ArrayList by index...).

From: Kenton Varda ken...@google.com
To: Ron Reynolds tequila...@ymail.com
Cc: protobuf@googlegroups.com
Sent: Wed, March 17, 2010 1:32:50 PM
Subject: Re: [protobuf] modifying lists within a message

Also note that we cannot return a modifiable list even from the Builder's getter because the caller could then hold on to that list and modify it after the message is built. We really need to prevent any kind of modifications from happening after the message is built.
On Wed, Mar 17, 2010 at 1:31 PM, Kenton Varda ken...@google.com wrote:

On Wed, Mar 17, 2010 at 12:59 PM, Ron Reynolds tequila...@ymail.com wrote:

> seems odd that you can delete all via clearXXX() or add via addAllXXX() and addXXX() but not remove a single element by index...

Our experience is that if we provide a way to remove a single element by index, people will mistakenly think that the method is more efficient than clearing the whole list and rebuilding it, when in fact both operations are O(n). The result of this is that people write loops like:

  for (all elements) {
    if (shouldFilter(element)) {
      remove(element);
    }
  }

This loop is O(n^2).

> well, here's a work-around - if you put a utility class into the same package as the ProtobufMessages generated class it can access the protected Builder.internalGetResult() - since package-level access is higher than protected-level access...

Yikes, is that true? I don't think it is, but if so we need to fix it. I'm pretty sure protected is neither higher nor lower than package-level -- it means something entirely different.
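On the access question left open here: in the Java language, a protected member is accessible from any class in the same package as the declaring class, and additionally from subclasses in other packages, so the work-around does compile. A minimal sketch follows. GeneratedLike and internalState() are hypothetical stand-ins for the generated class and its protected method, and both classes sit in the same (default) package. Whether a given protobuf release still exposes such a protected method is not something this example shows.

```java
// Both classes live in the same package (here the default package).
class GeneratedLike {
    // Hypothetical stand-in for a protected member of a generated builder.
    protected String internalState() {
        return "internal";
    }
}

public class SamePackageAccess {
    /** Not a subclass of GeneratedLike, yet the call compiles because the package matches. */
    public static String peek(GeneratedLike target) {
        return target.internalState();
    }

    public static void main(String[] args) {
        System.out.println(peek(new GeneratedLike())); // prints internal
    }
}
```

Moving SamePackageAccess into a different package (without making it a subclass) turns the call into a compile error, which is the behavior the "protected means something different" intuition expects in every case.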
Re: [protobuf] Re: Generated hashcode() returns different values across JVM instances?
the cornerstone of Hash* containers is: (A.equals(B)) => (A.hashCode() == B.hashCode()) for all A, B. tho it's not explicitly stated it would seem to be implied that is within a single JVM. not sure if the code in question maintains that rule within a JVM (if not that's a big deal). if so that would seem sufficient for all but the most distributed of Hash* containers, such as where a client which is remote from the storage (i.e., in another JVM) determines the bucket (based on hashCode()) to find that the element in question has been placed into another bucket because the hashCode() within the containing JVM has evaluated to another value. that's a pretty far-fetched, but not unimaginable, situation.

From: Jason Hsueh jas...@google.com
To: Jay Booth jaybo...@gmail.com
Cc: Protocol Buffers protobuf@googlegroups.com
Sent: Wed, May 18, 2011 12:00:19 PM
Subject: Re: [protobuf] Re: Generated hashcode() returns different values across JVM instances?

Jumping in late to the thread, and I'm not really a Java person, so I may be misunderstanding something here. But as far as I can tell, you are asking for hashCode() to be a 'consistent' hash across processes. hashCode() as implemented is still useful within a single JVM, allowing you to use protobufs in HashMaps based on content rather than object identity. That was the intended use case.

On Wed, May 18, 2011 at 11:48 AM, Jay Booth jaybo...@gmail.com wrote:

Well, that's your prerogative, I guess, but why even implement hashcode at all then? Just inherit from object and you're getting effectively the same behavior. Is that what you're intending?

On May 16, 10:03 am, Pherl Liu liuj...@google.com wrote:

We discussed internally and decided not to make hashCode() return a deterministic result. If you need a consistent hashcode in different runs, use toByteString().hashCode().
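The distinction drawn in this part of the thread can be sketched with plain JDK classes. This is an analogy under stated assumptions, not protobuf code: serialize() is a hypothetical stand-in for a deterministic serialization such as toByteString(), and the "descriptor" is just an arbitrary object whose identity hash gets mixed in. A hash computed only from the serialized bytes depends on content alone, while a hash that mixes in an identity hash is stable inside one JVM but is not guaranteed to match in another run.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ContentHash {
    /** Hypothetical stand-in for a deterministic serialization of a two-field message. */
    public static byte[] serialize(String name, int id) {
        return (name + "|" + id).getBytes(StandardCharsets.UTF_8);
    }

    /** Content-only hash: a function of the serialized bytes and nothing else. */
    public static int contentHash(String name, int id) {
        return Arrays.hashCode(serialize(name, id));
    }

    /** Mixes in an identity hash, the way a descriptor hashed by pointer would contribute. */
    public static int identityMixedHash(Object descriptor, String name, int id) {
        return 31 * System.identityHashCode(descriptor) + contentHash(name, id);
    }

    public static void main(String[] args) {
        Object descriptor = new Object();
        // Same content gives the same value, in this run and in any other run.
        System.out.println(contentHash("ron", 1) == contentHash("ron", 1));
        // Stable within this JVM for the same descriptor object; may differ between runs.
        System.out.println(identityMixedHash(descriptor, "ron", 1)
                == identityMixedHash(descriptor, "ron", 1));
    }
}
```

Both variants satisfy the equals/hashCode rule inside a single JVM, which is all that HashMap and HashSet require. Only the content-only variant is suitable for choosing a bucket on one machine and looking it up on another.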
Quoted from Kenton:

Hashing the content of the descriptor would actually be incorrect, because two descriptors with exactly the same content are still considered different types. Descriptors are compared by identity, hence they are hashed by pointer.

Removing the descriptor from the calculation would indeed make hashCode() consistent between two runs of the same binary, probably at insignificant runtime cost. Of course, once you do that, you will never be able to introduce non-determinism again because people will depend on it.

But there's a much bigger risk. People may actually start depending on hashCode() returning consistent results between two different versions of the binary, or two completely separate binaries that compile in the same protocol, or -- most dangerously -- two different versions of the same protocol (e.g. with fields added or removed). I think it would be very difficult and limiting to make these guarantees, so I would be extremely cautious about this.

Certainly, there is no implementation of hashCode() that would be any safer than .toByteString().hashCode(). So, I'd advise steering people to the latter. Note that if unknown fields are present, the results may still be inconsistent. However, there is no reasonable way to implement a hashCode() that is consistent in the presence of unknown fields.

On Thu, May 12, 2011 at 5:32 AM, Ben Wright compuware...@gmail.com wrote:

I think we wrote those replies at the same time : )

You're right, at the cost of some additional hash collisions, the simplest solution is to simply not include the type / descriptor in the hash calculation at all. The best / least-collision solution with good performance would be what I wrote in my previous post, but that requires that someone (presumably a current committer) with sufficient knowledge of the Descriptor types have enough time to update the compiler and java libraries accordingly.

Any input from a committer for this issue?
Seems the simple solution would take less than an hour to push into the stream and could make it into the next release.

On May 11, 5:25 pm, Ben Wright compuware...@gmail.com wrote:

Alternatively... instead of putting the onus on the compiler, the hashcode could be computed by the JVM at initialization time for the Descriptor instance, (which would also help performance of dynamically parsed Descriptor instance hashcode calls). i.e.

  private final int computedHashcode;

  public Descriptor() {
    // initialization
    computedHashcode = do_compute_hashCode();
  }

  public int hashCode() {
    return computedHashcode;
  }

  public int do_compute_hashCode() {
    return /* compute hashcode */;
  }

This is all talking towards optimum performance implementation... the real problem is the need for a hashCode implementation for Descriptor based on the actual Descriptor's content...

On May 11, 4:54 pm, Ben Wright compuware...@gmail.com wrote:

Jay: Using the class name to generate the