Re: [protobuf] Message limit

2010-01-13 Thread Delip Rao
Thanks folks, that was very useful. Right now I have a sequence of
messages, since we're processing serially. RecordIO seems like a great
idea. Is the framing format just multiple messages in a file with an
inverted index at the beginning?

- Delip

On Tue, Jan 12, 2010 at 2:35 PM, Kenton Varda ken...@google.com wrote:
 So to rephrase what I said:  You should break up your message into multiple
 pieces that you store / send one at a time.  Usually very large messages are
 actually lists of smaller messages, so instead of using one big repeated
 field, store each message separately.  When storing to a file, it's probably
 advantageous to use a framing format that lets you store multiple
 records such that you can seek to any particular record quickly -- using a
 large repeated field doesn't provide this anyway, so you need something else
 (we have some code internally that we call RecordIO).
 BTW, we would love to open source the libraries I mentioned, it's just a
 matter of finding the time to get it done.
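
For readers following along: RecordIO itself was not open sourced, but a record-framing format in its spirit can be sketched. Everything below (the 4-byte length prefix, the in-memory offset index) is an illustrative assumption, not the actual RecordIO format:

```python
import io
import struct

def write_records(stream, payloads):
    """Write length-prefixed records; return the byte offset of each one."""
    offsets = []
    for p in payloads:
        offsets.append(stream.tell())
        stream.write(struct.pack("<I", len(p)))  # 4-byte little-endian length
        stream.write(p)
    return offsets

def read_record_at(stream, offset):
    """Seek directly to a record by its offset and read it back."""
    stream.seek(offset)
    (size,) = struct.unpack("<I", stream.read(4))
    return stream.read(size)

buf = io.BytesIO()
index = write_records(buf, [b"alpha", b"beta", b"gamma"])
third = read_record_at(buf, index[2])  # seek straight to the third record
```

The offset index is what makes "seek to any particular record quickly" possible; it could equally be written at the end of the file and loaded up front, which is roughly the inverted-index layout Delip asks about.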

 On Tue, Jan 12, 2010 at 11:29 AM, Kenton Varda ken...@google.com wrote:

 Dang it, I got my mailing lists mixed up and referred to some things we
 haven't released open source.  Sigh.

 On Tue, Jan 12, 2010 at 11:28 AM, Kenton Varda ken...@google.com wrote:

 But you should consider a design that doesn't require you to send
 enormous messages.  Protocol buffers are not well-optimized for this sort of
 use.  For data stored on disk, consider storing multiple records in a
 RecordIO file.  For data passed over Stubby, consider streaming it in
 multiple pieces.

 On Tue, Jan 12, 2010 at 9:40 AM, Jason Hsueh jas...@google.com wrote:

 The limit applies to the data source from which a message is parsed. So
 if you want to parse a serialization of Foo, it applies to Foo. But if you
 parse a bunch of Bar messages one by one, and add them individually to Foo,
 then the limit only applies to each individual Bar.
 You can change the limit in your code if you create your own
 CodedInputStream and call its SetTotalBytesLimit method in C++, or its Java
 equivalent setSizeLimit.
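
The per-message point above can be illustrated without protobuf at all: read length-prefixed records one at a time, so any size limit applies per record rather than to the whole file. The framing and helper below are hypothetical; a real reader would call `bar.ParseFromString(payload)` on each payload:

```python
import io
import struct

LIMIT = 64 << 20  # protobuf's default per-parse limit: 64 MB

def iter_records(stream):
    """Yield one length-prefixed payload at a time, enforcing LIMIT per record."""
    while True:
        header = stream.read(4)
        if not header:
            return
        (size,) = struct.unpack("<I", header)
        if size > LIMIT:
            raise ValueError("record of %d bytes exceeds the limit" % size)
        # A real reader would now parse: bar = Bar(); bar.ParseFromString(...)
        yield stream.read(size)

buf = io.BytesIO()
for p in (b"bar-1", b"bar-2"):
    buf.write(struct.pack("<I", len(p)) + p)
buf.seek(0)
records = list(iter_records(buf))
```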

 On Tue, Jan 12, 2010 at 8:41 AM, Delip Rao delip...@gmail.com wrote:

 Hi,

 I'm trying to understand protobuf message size limits. Is the 64M
 message limit fixed or can it be changed via some compile option? If I
 have a message Foo defined as:

 message Foo {
  repeated Bar bars = 1;
 }

 Will the limit apply to Foo or just the individual Bars?

 Thanks,
 Delip

 --
 You received this message because you are subscribed to the Google
 Groups Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.









[protobuf] Re: Issue 122 in protobuf: Two test failures on Windows

2010-01-13 Thread protobuf


Comment #14 on issue 122 by briford.wylie: Two test failures on Windows
http://code.google.com/p/protobuf/issues/detail?id=122

Hi, thanks for the tip. Worked fine.

--
You received this message because you are listed in the owner
or CC fields of this issue, or because you starred this issue.
You may adjust your issue notification preferences at:
http://code.google.com/hosting/settings




[protobuf] In python - How would I send and receive a PB in the POST payload of http request?

2010-01-13 Thread Rich
I'm fairly new to python and very new to protocol buffers.  Any points
in the right direction would be helpful.

thx




Re: [protobuf] In python - How would I send and receive a PB in the POST payload of http request?

2010-01-13 Thread Kenton Varda
You'd need to use a separate HTTP library for that.  Protobuf itself doesn't
provide HTTP integration, but once you have the bytes from the HTTP payload
you can use protobuf to parse them.
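
A modern-Python sketch of that flow, using only the standard library (the local echo server and the `application/x-protobuf` header are illustrative; the payload is raw bytes so the example runs without protobuf installed -- a real client would send `msg.SerializeToString()` and parse the response with `ParseFromString`):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    """Stand-in server that echoes the POST body back to the client."""
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        self.send_response(200)
        self.send_header("Content-Type", "application/x-protobuf")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

payload = b"\x08\x96\x01"  # stands in for msg.SerializeToString()
req = urllib.request.Request(
    "http://127.0.0.1:%d/" % server.server_port,
    data=payload,  # data= makes urllib send a POST
    headers={"Content-Type": "application/x-protobuf"},
)
with urllib.request.urlopen(req) as resp:
    received = resp.read()  # a real client would ParseFromString(received)
server.shutdown()
```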

On Wed, Jan 13, 2010 at 10:04 AM, Rich rz.li...@gmail.com wrote:

 I'm fairly new to python and very new to protocol buffers.  Any points
 in the right direction would be helpful.

 thx








[protobuf] STL_HASH.m4

2010-01-13 Thread vikram
Hello Guys,

I see that Google Protocol Buffers now supports unordered_map with the
new modification in hash.h, but I am confused about where exactly
stl_hash.m4 looks for unordered_map by default. Can we make it look in
a different directory? The xlc compiler on AIX is installed under
XYZ/vacpp/include, which is different from the default /usr/include
directory.

I tried to run m4 with stl_hash.m4 as input and XYZ/vacpp/include as
the include directory, but it failed, saying an end quote was not
provided. Is there any way I can make stl_hash.m4 look into a
different include directory than /usr/include?

Thanks & Regards,
Vikram




Re: [protobuf] STL_HASH.m4

2010-01-13 Thread Kenton Varda
stl_hash.m4 should automatically look in whatever directory your compiler
uses.  If for some reason your compiler does not automatically look in the
directory you want, then you should add the proper CXXFLAGS to make it look
there, e.g.:

  ./configure CXXFLAGS=-I/XYZ/vacpp/include

(-I is GCC's flag for this; your compiler may be different.)

On Wed, Jan 13, 2010 at 12:20 PM, vikram patilvik...@gmail.com wrote:

 Hello Guys,

 I see that Google Protocol Buffers now supports unordered_map with the
 new modification in hash.h, but I am confused about where exactly
 stl_hash.m4 looks for unordered_map by default. Can we make it look in
 a different directory? The xlc compiler on AIX is installed under
 XYZ/vacpp/include, which is different from the default /usr/include
 directory.

 I tried to run m4 with stl_hash.m4 as input and XYZ/vacpp/include as
 the include directory, but it failed, saying an end quote was not
 provided. Is there any way I can make stl_hash.m4 look into a
 different include directory than /usr/include?

 Thanks & Regards,
 Vikram








[protobuf] How can I reset a FileInputStream?

2010-01-13 Thread Jacob Rief
Hello Kenton,

currently I have the following problem: I have a very big file with
many small messages serialized with Protobuf. Each message contains
its own separator and thus can be found even in an unsynchronized
stream. I move through this file using lseek64, because
FileInputStream::Skip only works in the forward direction and
FileInputStream::BackUp can move back only up to the current buffer
boundary. Since I am the owner of the file descriptor, which is also
used by FileInputStream, I can randomly seek to any position in the
file. However, after seeking, my FileInputStream is obviously in an
unusable state and has to be reset. Currently the only feasible
solution is to replace the current FileInputStream object with a new
one - which somehow seems quite inefficient!

Wouldn't it make sense to add a member function which resets a
FileInputStream to the state of a natively opened and repositioned
file descriptor? Or is there any other solution to randomly access the
raw content of the file, say by wrapping seek?

Regards,
Jacob




Re: [protobuf] How can I reset a FileInputStream?

2010-01-13 Thread Kenton Varda
On Wed, Jan 13, 2010 at 3:02 PM, Jacob Rief jacob.r...@gmail.com wrote:

 Currently the only feasible solution is to
 replace the current FileInputStream object with a new one - which
 somehow seems quite inefficient!


What makes you think it is inefficient?  It does mean the buffer has to be
re-allocated but with a decent malloc implementation that shouldn't take
long.  Certainly the actual reading from the file would take longer.  Have
you seen performance problems with this approach?


 Wouldn't it make sense to add a member function which resets a
 FileInputStream to the state of a natively opened and repositioned
 file descriptor?


If there really is a performance problem with allocating new objects, then
sure.
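
The workaround under discussion -- seek the descriptor out of band, then build a fresh buffered stream over it -- can be sketched in Python, with `io.BufferedReader` standing in for FileInputStream (the file contents and offsets here are illustrative):

```python
import io
import os
import tempfile

# Write a small file to seek around in.
fd, path = tempfile.mkstemp()
os.write(fd, b"AAAABBBBCCCC")

# Out-of-band seek on the raw descriptor, then a *fresh* buffered reader:
# the old reader's buffer would no longer match the file position, which
# is exactly why the stream object gets replaced after a seek.
os.lseek(fd, 8, os.SEEK_SET)
reader = io.BufferedReader(io.FileIO(fd, closefd=False))
chunk = reader.read(4)  # reads from offset 8

os.close(fd)
os.unlink(path)
```

The only per-seek cost is allocating the new reader and its buffer, which is Kenton's point: that allocation is cheap next to the disk read itself.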



Re: [protobuf] Re: WriteDelimited/parseDelimited in python

2010-01-13 Thread Kenton Varda
(I have this on the back burner as I'm kind of swamped, but I do want to get
this submitted at some point, hopefully within a week.)
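
For reference, the varint-delimited framing that writeDelimitedTo/parseDelimitedFrom use in the Java API can be sketched in pure Python; payloads are opaque bytes here so the sketch runs without protobuf installed (a real version would serialize and parse messages):

```python
import io

def write_delimited(stream, payload):
    """Prefix the payload with its length as a base-128 varint."""
    n = len(payload)
    while True:
        bits = n & 0x7F
        n >>= 7
        # set the continuation bit while more length bytes follow
        stream.write(bytes([bits | 0x80]) if n else bytes([bits]))
        if not n:
            break
    stream.write(payload)

def read_delimited(stream):
    """Read one varint length prefix, then that many payload bytes."""
    shift = 0
    size = 0
    while True:
        b = stream.read(1)
        if not b:
            return None  # clean EOF between messages
        size |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            break
        shift += 7
    return stream.read(size)

buf = io.BytesIO()
for p in (b"first", b"second"):
    write_delimited(buf, p)
buf.seek(0)
msgs = []
while True:
    m = read_delimited(buf)
    if m is None:
        break
    msgs.append(m)
```

Note how the length prefix is exactly what the streaming discussion below turns on: a caller who can read the varint independently (option (a)) knows how many bytes to wait for before parsing.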

On Tue, Jan 5, 2010 at 3:57 AM, Graham Cox cox.gra...@gmail.com wrote:

 I was saying the user *could* do that, and that it's currently what I'm
 doing in my server-side code. The reason being, as you said, if you naively
 read from a stream and the message isn't all present then you need to block
 until it is with the way that the Java code works at present. If you are
 using it for client-side code then likely this is not an issue in the
 slightest, but a server that needs to be able to handle many clients at once
 just cannot block on one of them...

 As to your other alternative, (a), I would suggest that this leaves too
 much of the underlying network protocol bare to the caller. This will make
 it very difficult to change the way that delimiting messages happens in the
 future should such a thing be required. If - for example - it is decided to
 go from having the length prefixed to having a special delimiting sequence
 after the message then it will cause all current calling code to need to be
 changed. It might be that this is considered a low enough level library that
 this is acceptable, but that would be a Google decision...

 One more alternative would be how the asn1c library works for parsing ASN.1
 streams into objects, which is to be resumable. The decoder reads all the
 data it is given, and tries to build the object from this. If it doesn't
 have enough data yet then it does what it can, remembers where it got to and
 returns back to the user who can then supply more data when it becomes
 available. If the entire message does parse from the data provided then
 return back to the user the amount of data consumed so that they can discard
 this (reading from the stream directly makes this slightly cleaner still).
 At present, the Protobuf libraries (any of them) cannot support this method
 of decoding an object, and it is not a trivial change to make it possible to
 do, but it does - IMO - give a much cleaner and easier to use method of use.
 --
 Graham Cox

 On Tue, Jan 5, 2010 at 1:32 AM, Kenton Varda ken...@google.com wrote:

 Make sure to reply all so that the group is CC'd.

 So you are saying that the user should read whatever data is on the
 socket, then attempt to parse it, and if it fails, assume that it's because
 there is more data to read?  Seems rather wasteful.  I think what we ideally
 want is either:
 (a) Provide a way for the caller to read the size independently, so that
 they can then make sure to read that many bytes from the input before
 parsing.
 (b) Provide a method that reads from a stream, so that the protobuf
 library can automatically take care of reading all necessary bytes.

 Option (b) is obviously cleaner but has a few problems:
 - We have to choose a particular stream interface to support.  While the
 Python file-like interface is pretty common I'm not sure if it's universal
 for this kind of task.
 - If not all bytes of the message are available yet, we'd have to block.
  This might be fine most of the time, but would be unacceptable for some
 uses.

 Thoughts?

 On Mon, Jan 4, 2010 at 3:09 PM, Graham Cox cox.gra...@gmail.com wrote:

 I'm using it for reading/writing to sockets in my functional tests -
 works well enough there...
 In my Java-side server code, I read from the socket into a byte buffer,
 then deserialize the byte buffer into Protobuf objects, throwing away the
 data that has been deserialized. The python MergeDelimitedFromString
 function also returns the number of bytes that were processed to build up
 the Protobuf object, so the user could easily do the same - read the socket
 onto the end of a buffer, and then while the buffer is successfully
 deserializing into objects throw away the first x bytes as appropriate...

 Just a thought :)

 On Mon, Jan 4, 2010 at 9:57 PM, Kenton Varda ken...@google.com wrote:

 Hmm, it occurs to me that this currently is not useful for reading from
 a socket or similar stream since the caller has to make sure to read an
 entire message before trying to parse it, but the caller doesn't actually
 know how long the message is (because the code that determines this is
 encapsulated).  Any thoughts on this?

 On Mon, Jan 4, 2010 at 12:11 PM, Kenton Varda ken...@google.com wrote:

 Mostly looks good.  There are some style issues (e.g. lines over 80
 chars) but I can clean those up myself.

 You'll need to sign the contributor license agreement:

 http://code.google.com/legal/individual-cla-v1.0.html -- If you own
 copyright on this change.
 http://code.google.com/legal/corporate-cla-v1.0.html -- If your
 employer does.

 Please let me know after you've done this and then I can submit these.


 On Fri, Jan 1, 2010 at 12:53 PM, Graham cox.gra...@gmail.com wrote:

 On Jan 1, 7:32 am, Kenton Varda ken...@google.com wrote:
  I don't think an equivalent has been added to the Python API.