Re: [protobuf] modifying lists within a message

2010-03-17 Thread Ron Reynolds
seems odd that you can delete all via clearXXX() or add via addAllXXX() and 
addXXX() but not remove a single element by index...

well, here's a work-around - if you put a utility class into the same package 
as the ProtobufMessages generated class it can access the protected 
Builder.internalGetResult() - since package-level access is higher than 
protected-level access...





From: Jason Hsueh jas...@google.com
To: Ron tequila...@ymail.com
Cc: Protocol Buffers protobuf@googlegroups.com
Sent: Wed, March 17, 2010 9:31:58 AM
Subject: Re: [protobuf] modifying lists within a message

Messages are indeed immutable once built. getXXXList() for both the builder and 
the message should return unmodifiable lists: if you try to modify them you 
should get an exception. If you haven't already, take a look at the Java 
tutorial here: 
http://code.google.com/apis/protocolbuffers/docs/javatutorial.html

For your particular problem I think you'll need to do a clearXXX() on the 
entire list, and add back all the elements except the entry you want to discard.
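
A minimal sketch of that clear-and-re-add approach. Generated protobuf code is 
not reproduced here: ItemListBuilder is a hypothetical stand-in that only 
imitates the shape of a generated builder for a repeated string field named 
"item" (addItem, addAllItem, clearItem, and a read-only getItemList), and 
removeAt is an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a generated Builder with a repeated field "item".
class ItemListBuilder {
    private final List<String> items = new ArrayList<>();

    ItemListBuilder addItem(String value) {
        items.add(value);
        return this;
    }

    ItemListBuilder addAllItem(Iterable<String> values) {
        for (String v : values) {
            items.add(v);
        }
        return this;
    }

    ItemListBuilder clearItem() {
        items.clear();
        return this;
    }

    // Like the getter discussed in the thread, this hands out a read-only view.
    List<String> getItemList() {
        return Collections.unmodifiableList(items);
    }
}

class RemoveByRebuild {
    // Copies every element except the one at `index`, clears the field,
    // then adds the survivors back. One pass over the list, O(n).
    static void removeAt(ItemListBuilder builder, int index) {
        List<String> current = builder.getItemList();
        List<String> kept = new ArrayList<>(current.size());
        for (int i = 0; i < current.size(); i++) {
            if (i != index) {
                kept.add(current.get(i));
            }
        }
        builder.clearItem().addAllItem(kept);
    }

    public static void main(String[] args) {
        ItemListBuilder b = new ItemListBuilder().addItem("a").addItem("b").addItem("c");
        removeAt(b, 1);
        System.out.println(b.getItemList()); // prints [a, c]
    }
}
```

The survivors are copied into a separate list before clearItem() runs, because 
the list returned by the getter is only a view of the builder's own storage.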


On Wed, Mar 17, 2010 at 7:57 AM, Ron tequila...@ymail.com wrote:

i didn't see anything on this in the archives but i admit i didn't do
an exhaustive search.  i'm using protobuf (2.3.0) to store objects in
a Voldemort cluster and am working on the code to modify these
objects.  so i load the message fine, but it is composed of a list of
other messages and i'd like to modify the contents of that list and
then stream it back into the cluster.  i assumed this is what
Type.newBuilder(Type) was for, but have run into an interesting
wrinkle trying to modify the list (once i've located the entry i want
to remove by index).  Builder.getXXXList() returns an
unmodifiableList, but the actual message's getXXXList() returns a ref
to the actual list (i.e., i could modify the list in the message, but
not the list within the message within the builder).  i haven't gone
through all the code yet but this seemed rather odd since i had
thought messages within builders were mutable until build() but once
built were considered immutable (tho perhaps that's just in my head).

is there a doc showing best practices for modifying Protobuf messages
i should be reading?

thanks,
...ron.

--
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.






Re: [protobuf] modifying lists within a message

2010-03-17 Thread Ron Reynolds
but this message isn't built yet - i'm trying to modify a copy of a previous 
message wrapping it in a newBuilder so i don't have to copy all the fields over 
from the previous (immutable) to the new (mutable until i call 
Builder.build()).  the only thing (so far) that i can't do to that new message 
is remove a single element from its list variable.  

you mean it's O(n) because of the shift of all elements after the index since 
ArrayList is a contiguous block?  i wouldn't have called that O(n).  certainly 
it's O(n) to find the element in the unsorted list, but that's unavoidable 
(unless i add code to keep the elements sorted...)  once i find the index of 
the single element in the list i want i have to remove it and then add a new 
(copied and then modified from the original) element to replace it.  (order is 
unimportant, at least in this case).

and yes, it is true - classes in the same package have access to each other's 
protected members.  package and protected might MEAN something different 
but the way the JVM (the Sun JVM anyway) acts the access level is an int:  0 = 
public, 1 = protected, 2 = package, 3 = private - you can access everything <= 
your access-level with respect to the other class.  t'would be a pity for you 
to fix it - i was rather happy to figure that one out. :)  if you do, 
wiping the entire list and inserting N-1 elements back in will be my only 
(non-reflection-based) option (which i'm pretty sure is more costly than 
removing a single element from an ArrayList by index...).
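
A small sketch of the access rule in question. In the Java language, a 
protected member is accessible from any class in the same package as well as 
from subclasses, so the call below compiles even though SamePackageUtil is not 
a subclass. Both class names and the internalState() method are hypothetical 
stand-ins, not the generated protobuf code.

```java
// Both classes are declared in the same package (here, the unnamed package).
class GeneratedLikeBuilder {
    // Hypothetical protected method, standing in for the protected builder
    // method mentioned in the thread.
    protected String internalState() {
        return "partial";
    }
}

class SamePackageUtil {
    // Not a subclass of GeneratedLikeBuilder, yet this compiles because
    // protected access includes package access.
    static String peek(GeneratedLikeBuilder builder) {
        return builder.internalState();
    }

    public static void main(String[] args) {
        System.out.println(peek(new GeneratedLikeBuilder())); // prints partial
    }
}
```

Moving SamePackageUtil to a different package would turn the call into a 
compile error unless it extended GeneratedLikeBuilder.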





From: Kenton Varda ken...@google.com
To: Ron Reynolds tequila...@ymail.com
Cc: protobuf@googlegroups.com
Sent: Wed, March 17, 2010 1:32:50 PM
Subject: Re: [protobuf] modifying lists within a message

Also note that we cannot return a modifiable list even from the Builder's 
getter because the caller could then hold on to that list and modify it after 
the message is built.  We really need to prevent any kind of modifications from 
happening after the message is built.


On Wed, Mar 17, 2010 at 1:31 PM, Kenton Varda ken...@google.com wrote:




On Wed, Mar 17, 2010 at 12:59 PM, Ron Reynolds tequila...@ymail.com wrote:



seems odd that you can delete all via clearXXX() or add via addAllXXX() and 
addXXX() but not remove a single element by index...



Our experience is that if we provide a way to remove a single element by 
index, people will mistakenly think that the method is more efficient than 
clearing the whole list and rebuilding it, when in fact both operations are 
O(n).  The result of this is that people write loops like:


  for (all elements) {
    if (shouldFilter(element)) {
      remove(element);
    }
  }


This loop is O(n^2).
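
For contrast, a sketch of the single-pass alternative on a plain 
java.util.List: copy the elements to keep into a new list instead of calling 
remove() inside the loop. FilterOnce and withoutMatching are illustrative 
names, and the predicate plays the role of shouldFilter above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FilterOnce {
    // Builds a new list holding only the elements that are NOT filtered out.
    // Each element is visited once and appended at most once, so the whole
    // operation is O(n); no tail of the list is ever shifted.
    static <T> List<T> withoutMatching(List<T> source, Predicate<? super T> shouldFilter) {
        List<T> kept = new ArrayList<>(source.size());
        for (T element : source) {
            if (!shouldFilter.test(element)) {
                kept.add(element);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        // Filter out the even numbers.
        System.out.println(withoutMatching(numbers, n -> n % 2 == 0)); // prints [1, 3, 5]
    }
}
```

With a builder, the resulting list would then go back in via a clear followed 
by an add-all, which keeps the total cost linear.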

 
well, here's a work-around - if you put a utility class into the same package 
as the ProtobufMessages generated class it can access the protected 
Builder.internalGetResult() - since package-level access is higher than 
protected-level access...



Yikes, is that true?  I don't think it is, but if so we need to fix it.  I'm 
pretty sure protected is neither higher nor lower than package-level -- it 
means something entirely different.




Re: [protobuf] Re: Generated hashcode() returns different values across JVM instances?

2011-05-18 Thread Ron Reynolds
the corner-stone of Hash* containers is:
 (A.equals(B)) => (A.hashCode() == B.hashCode()) for all A, B.  

tho it's not explicitly stated, it would seem to be implied that this holds 
within a single JVM.
not sure if the code in question maintains that rule within a JVM (if not, 
that's a big deal).
if so, that would seem sufficient for all but the most distributed of Hash* 
containers, such as where a client which is remote from the storage (i.e., in 
another JVM) determines the bucket (based on hashCode()) only to find that the 
element in question has been placed into another bucket because the hashCode() 
within the containing JVM has evaluated to another value.  that's a pretty 
far-fetched, but not unimaginable, situation.




From: Jason Hsueh jas...@google.com
To: Jay Booth jaybo...@gmail.com
Cc: Protocol Buffers protobuf@googlegroups.com
Sent: Wed, May 18, 2011 12:00:19 PM
Subject: Re: [protobuf] Re: Generated hashcode() returns different values across JVM instances?

Jumping in late to the thread, and I'm not really a Java person, so I may be 
misunderstanding something here. But as far as I can tell, you are asking for 
hashCode() to be a 'consistent' hash across processes. hashCode() as 
implemented is still useful within a single JVM, allowing you to use protobufs 
in HashMaps based on content rather than object identity. That was the 
intended use case.
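
A minimal illustration of what content-based hashing buys inside one JVM, 
using a hand-written value class rather than a generated message. PointKey and 
ContentHashDemo are hypothetical names. Because equals() and hashCode() are 
both derived from the fields, a second, distinct instance with the same 
content finds the entry stored under the first.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical value class whose equality and hash depend only on content.
final class PointKey {
    private final int x;
    private final int y;

    PointKey(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PointKey)) {
            return false;
        }
        PointKey that = (PointKey) other;
        return x == that.x && y == that.y;
    }

    // Equal objects must produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class ContentHashDemo {
    // Stores a value under one key instance and looks it up with another
    // instance that has the same content.
    static String lookup() {
        Map<PointKey, String> map = new HashMap<>();
        map.put(new PointKey(1, 2), "found");
        return map.get(new PointKey(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // prints found
    }
}
```

If PointKey inherited equals() and hashCode() from Object, the lookup would 
return null, since the two keys are different objects.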


On Wed, May 18, 2011 at 11:48 AM, Jay Booth jaybo...@gmail.com wrote:

Well, that's your prerogative, I guess, but why even implement
hashcode at all then?  Just inherit from object and you're getting
effectively the same behavior.  Is that what you're intending?


On May 16, 10:03 am, Pherl Liu liuj...@google.com wrote:
 We discussed internally and decided not to make hashCode()
 return a deterministic result. If you need a consistent hashcode across
 different runs, use toByteString().hashCode().

 Quoted from Kenton:

 Hashing the content of the descriptor would actually be incorrect, because
 two descriptors with exactly the same content are still considered different
 types.  Descriptors are compared by identity, hence they are hashed by
 pointer.

 Removing the descriptor from the calculation would indeed make hashCode()
 consistent between two runs of the same binary, and probably insignificant
 runtime cost.  Of course, once you do that, you will never be able to
 introduce non-determinism again because people will depend on it.

 But there's a much bigger risk.  People may actually start depending on
 hashCode() returning consistent results between two different versions of
 the binary, or two completely separate binaries that compile in the same
 protocol, or -- most dangerously -- two different versions of the same
 protocol (e.g. with fields added or removed).  I think it would be very
 difficult and limiting to make these guarantees, so I would be extremely
 cautious about this.

 Certainly, there is no implementation of hashCode() that would be any safer
 than .toByteString().hashCode().  So, I'd advise steering people to the
 latter.  Note that if unknown fields are present, the results may still be
 inconsistent.  However, there is no reasonable way to implement a hashCode()
 that is consistent in the presence of unknown fields.
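
 A rough sketch of the idea behind hashing the serialized form, without the
 protobuf library: a hash computed purely from bytes depends only on those
 bytes, so it comes out the same in every run. The byte array here is a
 stand-in for a message's serialized bytes, SerializedHashDemo and stableHash
 are illustrative names, and java.util.Arrays.hashCode is used only as an
 example of a content-only hash; it is an assumption that this mirrors what
 toByteString().hashCode() does internally.

```java
import java.util.Arrays;

class SerializedHashDemo {
    // Hashes only the byte content. Arrays.hashCode(byte[]) is specified in
    // terms of the element values, so equal content gives an equal result
    // in any JVM.
    static int stableHash(byte[] serialized) {
        return Arrays.hashCode(serialized);
    }

    public static void main(String[] args) {
        byte[] first = {1, 2, 3};
        byte[] second = {1, 2, 3};

        // Same content, same hash, regardless of object identity.
        System.out.println(stableHash(first) == stableHash(second)); // prints true

        // Object's default hashCode is identity-based and may differ from
        // one run to the next, so it is printed without an expected value.
        System.out.println(new Object().hashCode());
    }
}
```

 As noted above, this only helps if both sides serialize to the same bytes;
 unknown fields or a changed schema can still produce different results.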








 On Thu, May 12, 2011 at 5:32 AM, Ben Wright compuware...@gmail.com wrote:
  I think we wrote those replies at the same time : )

  You're right, at the cost of some additional hash collisions, the
  simplest solution is to simply not include the type / descriptor in
  the hash calculation at all.

  The best / least-collision solution with good performance would be
  what I wrote in my previous post, but that requires someone
  (presumably a current committer) with sufficient knowledge of the
  Descriptor types to have enough time to update the compiler and java
  libraries accordingly.

  Any input from a committer for this issue?  Seems the simple solution
  would take less than an hour to push into the stream and could make it
  into the next release.

  On May 11, 5:25 pm, Ben Wright compuware...@gmail.com wrote:
   Alternatively... instead of putting the onus on the compiler, the
   hashcode could be computed by the JVM at initialization time for the
   Descriptor instance, (which would also help performance of dynamically
   parsed Descriptor instance hashcode calls).

   i.e.

   private final int computedHashcode;

   public Descriptor() {
     // initialization
     computedHashcode = do_compute_hashCode();
   }

   public int hashCode() {
     return computedHashcode;
   }

   public int do_compute_hashCode() {
     return // compute hashcode
   }

   This is all talking towards optimum performance implementation... the
   real problem is the need for a hashCode implementation for Descriptor
   based on the actual Descriptor's content...

   On May 11, 4:54 pm, Ben Wright compuware...@gmail.com wrote:

Jay:

Using the class name to generate the