[protobuf] Re: Issue 103 in protobuf: Protobuf 2.1.0 missing some sort of pthread linking?

2010-06-03 Thread protobuf


Comment #11 on issue 103 by l...@dashjr.org: Protobuf 2.1.0 missing some  
sort of pthread linking?

http://code.google.com/p/protobuf/issues/detail?id=103

It does seem to be fixed with that patch.




[protobuf] Issue 195 in protobuf: common.h should not have "using namespace std;"

2010-06-03 Thread protobuf

Status: New
Owner: ken...@google.com
Labels: Type-Defect Priority-Medium

New issue 195 by jmccaskey: common.h should not have "using namespace std;"
http://code.google.com/p/protobuf/issues/detail?id=195

I have a project I'm considering using Protocol Buffers for, but this
project doesn't currently use the standard C++ container classes or
algorithms.  In fact, we have some name collisions with the std namespace,
so in the few places where we do use it we must always fully qualify names
with std::.

In testing Protocol Buffers for our project, we found that
google/protobuf/stubs/common.h, which is included by the generated .pb.h
files, pulls in the std namespace, which then negatively affects a lot of
our downstream code.

I've locally removed the using directive, added it to each .cpp instead,
and fixed up the .h's to fully qualify names with std::.  It seems like it
would be best practice to remove the using directive from common.h in the
official build, though, instead of assuming everyone wants the std
namespace automatically used in their projects.  We'd love to see that
happen so that merging new versions won't be a nightmare for us in the
future if we go with Protocol Buffers in our project.
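
For reference, a minimal sketch of the contrast being requested
(illustrative declarations only, not the actual common.h contents):

#include <string>
#include <vector>

// What downstream code effectively sees today (sketch): one
// using-directive in a widely-included header pulls every std name
// into every translation unit that includes a .pb.h:
//   using namespace std;

// The requested style: the header fully qualifies std names and stays
// neutral; any using-directive moves into individual .cc files.
namespace myproject {
std::string JoinIds(const std::vector<int>& ids, char sep);
}  // namespace myproject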




Re: [protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread dirtysalt1...@gmail.com

Thank you for your reply.

Yes, I understand that this requirement would add a great deal of
complication; I just wanted to know whether the designers had any ideas
about how it might be implemented.

Actually, I have tested the performance of protobuf, Thrift, and some
other open-source products: protobuf's serialization is faster and its
output is smaller. I think protobuf is a great piece of open-source
software. :)

Linkedin: http://www.linkedin.com/in/dirlt


On 2010/6/4 1:40, Kenton Varda wrote:
That's correct.  Sorry, but encoding large chunks of data that cannot 
be parsed or serialized all at once is an explicit non-goal of 
protocol buffers.  Fulfilling such needs involves adding a great deal 
of complication to the system.  Your case may seem relatively simple, 
but other, more complicated cases require things like random access, 
search indexes, etc.  We chose to focus only on small messages.


It is certainly possible and useful to use protocol buffers as a 
building block when designing a format for large data sets.


On Thu, Jun 3, 2010 at 9:44 AM, dirtysalt1...@gmail.com <dirtysalt1...@gmail.com> wrote:


I think what you mean is that "I should redesign my protocol to implement
streaming functionality on top of protobuf, INSTEAD OF expecting protobuf
to implement it."

What I used to assume was "App -> Protobuf -> Stream Functionality"
[protobuf provides streaming directly; at the top, my app faces one
large protobuf].
And I think what you mean is "App -> Stream Functionality -> Protobuf"
[I have to implement streaming myself, but each stream packet is based
on protobuf; at the top, my app faces many small stream packets, each a
protobuf].

Linkedin:http://www.linkedin.com/in/dirlt


On 2010/6/4 0:21, Jason Hsueh wrote:

This really needs to be handled in the application since protobuf
has no idea which fields are expendable or can be truncated. What
I was trying to suggest earlier was to construct many Req
protobufs and serialize those individually. i.e., instead of 1
Req proto with 1,000,000 page ids, construct 1000 Req protos,
each containing 1000 page ids. You can serialize each of those
individually, stopping when you hit your memory budget.

That being said, I would suggest redesigning your protocol so
that you don't have to construct enormous messages. It sounds
like what you really want is something like the streaming
functionality in the rpc service - rather than sending one large
protobuf you would want to stream each page id.

On Thu, Jun 3, 2010 at 6:27 AM, dirlt <dirtysalt1...@gmail.com> wrote:

Thank you for your reply. :) For the first one, I think your answer is
quite clear.

But for the second one, I do want to produce the serialization of Req.

Let me explain again. :) Assume my application is like this:
0. the server app wants to send 1,000,000 page ids to the client
1. if the server app packs all 1,000,000 page ids and serializes them, it
will cost 1GB of memory

2. but the server app can only allocate 100MB of memory, so obviously the
server app can't send all 1,000,000 page ids to the client

3. meanwhile, the server app's protobuf is very clever. It [protobuf] can
calculate that "if the server app has 100MB, it can hold at most 10,000
page ids". So protobuf tells the server: "Hi server, if you only have
100MB of memory, I can only hold 10,000 page ids"

4. so the server app knows this, and serializes just 10,000 page ids into
memory instead of 1,000,000

I hope I have clarified it now. If protobuf doesn't implement this, do
you have any ideas about it?


On Jun 3, 12:40 am, Jason Hsueh <jas...@google.com> wrote:
> On Tue, Jun 1, 2010 at 6:21 AM, bnh <baoneng...@gmail.com> wrote:
> > I'm using protobuf as the protocol for a distributed system. But now I
> > have some questions about protobuf:
>
> > a. Does protobuf provide an interface for a user-defined allocator?
> > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
> > think I can optimize the memory allocation according to my application.
>
> No, there are no hooks for providing an allocator. You'd need to override
> malloc the way TCMalloc does if you want to use your own allocator.
>
> > b. Does protobuf provide a way to serialize a class/object partially
> > [or do you have some ideas about it]? Because my application is very
> > sensitive to memory usage. Such as this class:
>
> > class Req{
> > int userid;
> > vector pageid;
> 

Re: [protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Nader Salehi
To be clear, I do not encode the entire file!  Each file contains many
small messages, each of which is stored as a length delimited record.
It is just that there are quite a few messages bundled in one file.

I'm assuming that Evan's assessment still stands?

Cheers,
Nader

On 6/3/2010 15:05 Kenton Varda writes:
> Note that writing a 100GB file using CodedStream is probably a bad idea
> because:
> - Readers will have to read the entire file sequentially; they will not be
> able to seek to particular parts.
> - One bit of corruption anywhere in the file could potentially render the
> entire rest of the file unreadable.
> 
> Remember that this stuff was designed for small messages.  You should really
> use some sort of seekable, fault-tolerant container format for 100GB of
> data.  You can still encode each individual message using protobufs, which
> is useful as it allows the container format to treat each message as a
> simple byte blob.
> 
> On Thu, Jun 3, 2010 at 12:43 PM, Evan Jones  wrote:
> 
> > On Jun 3, 2010, at 15:29 , Nader Salehi wrote:
> >
> >> It is not a single object; I am writing into a coded output stream
> >> file which could grow to much larger than 2GB (it's more like 100GB).
> >> I also have to read from this file.
> >>
> >> Is there a performance hit in the above-mentioned scenario?
> >>
> >
> > No, this should work just fine. On the input side, you'll need to call
> > CodedInputStream.resetSizeCounter() after each message, otherwise you'll run
> > into the size limit.
> >
> >
> > Evan
> >
> > --
> > Evan Jones
> > http://evanjones.ca/
> >




Re: [protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Nader Salehi
It is not a single object; I am writing into a coded output stream
file which could grow to much larger than 2GB (it's more like 100GB).
I also have to read from this file.

Is there a performance hit in the above-mentioned scenario?

Nader


On 6/3/2010 15:03 Evan Jones writes:
> On Jun 3, 2010, at 14:18 , Nader Salehi wrote:
> > I was told that coded streams have issues when they are larger than
> > 2GB.  Is it true, and, if so, what are the issues?
> 
> If you have a single object that is 2GB in size, there are 32-bit  
> integers that will overflow. However, provided that you  
> call .resetSizeCounter() occasionally, I think it should work just  
> fine. I'm certainly using a single Java CodedInputStream per long  
> lived connection without any trouble. Unclear if I've sent > 2GB of  
> data over a single connection though.
> 
> Evan
> 
> --
> Evan Jones
> http://evanjones.ca/




Re: [protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread dirtysalt1...@gmail.com

Thank you for your reply. :)

Actually, in my application I CAN allocate enough RAM for the in-memory
message object. What I really want is to restrict the size of the
messages that are sent/received by the client/server. Right now,
ZeroCopyOutputStream can work for that.

But my app is really like an RPC system: each message that is sent or
received must parse to a complete but smaller object. I don't think
ZeroCopyOutputStream can guarantee that.

Thank you for the hints. I will rethink the message definition. :)

Linkedin: http://www.linkedin.com/in/dirlt


On 2010/6/4 1:39, Jason Hsueh wrote:
Ah, one option I missed is using an implementation of 
io::ZeroCopyOutputStream like io::FileOutputStream, which uses a fixed 
size buffer and flushes data to the file (socket) when the buffer is 
full. Then serializing a large message won't consume a lot of memory. 
Perhaps this is what you really wanted, rather than truncating the 
message?


However, you still need enough ram for the in-memory message object, 
and those are typically larger than the serialized form. Also, this 
approach may or may not work with your RPC system. It is probably 
still worthwhile for you to look at reworking your message definition 
so that you transmit smaller messages.


On Thu, Jun 3, 2010 at 9:44 AM, dirtysalt1...@gmail.com <dirtysalt1...@gmail.com> wrote:


I think what you mean is that "I should redesign my protocol to implement
streaming functionality on top of protobuf, INSTEAD OF expecting protobuf
to implement it."

What I used to assume was "App -> Protobuf -> Stream Functionality"
[protobuf provides streaming directly; at the top, my app faces one
large protobuf].
And I think what you mean is "App -> Stream Functionality -> Protobuf"
[I have to implement streaming myself, but each stream packet is based
on protobuf; at the top, my app faces many small stream packets, each a
protobuf].

Linkedin:http://www.linkedin.com/in/dirlt


On 2010/6/4 0:21, Jason Hsueh wrote:

This really needs to be handled in the application since protobuf
has no idea which fields are expendable or can be truncated. What
I was trying to suggest earlier was to construct many Req
protobufs and serialize those individually. i.e., instead of 1
Req proto with 1,000,000 page ids, construct 1000 Req protos,
each containing 1000 page ids. You can serialize each of those
individually, stopping when you hit your memory budget.

That being said, I would suggest redesigning your protocol so
that you don't have to construct enormous messages. It sounds
like what you really want is something like the streaming
functionality in the rpc service - rather than sending one large
protobuf you would want to stream each page id.

On Thu, Jun 3, 2010 at 6:27 AM, dirlt <dirtysalt1...@gmail.com> wrote:

Thank you for your reply. :) For the first one, I think your answer is
quite clear.

But for the second one, I do want to produce the serialization of Req.

Let me explain again. :) Assume my application is like this:
0. the server app wants to send 1,000,000 page ids to the client
1. if the server app packs all 1,000,000 page ids and serializes them, it
will cost 1GB of memory

2. but the server app can only allocate 100MB of memory, so obviously the
server app can't send all 1,000,000 page ids to the client

3. meanwhile, the server app's protobuf is very clever. It [protobuf] can
calculate that "if the server app has 100MB, it can hold at most 10,000
page ids". So protobuf tells the server: "Hi server, if you only have
100MB of memory, I can only hold 10,000 page ids"

4. so the server app knows this, and serializes just 10,000 page ids into
memory instead of 1,000,000

I hope I have clarified it now. If protobuf doesn't implement this, do
you have any ideas about it?


On Jun 3, 12:40 am, Jason Hsueh <jas...@google.com> wrote:
> On Tue, Jun 1, 2010 at 6:21 AM, bnh <baoneng...@gmail.com> wrote:
> > I'm using protobuf as the protocol for a distributed system. But now I
> > have some questions about protobuf:
>
> > a. Does protobuf provide an interface for a user-defined allocator?
> > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
> > think I can optimize the memory allocation according to my application.
>
> No, there are no hooks for providing an allocator. You'd need to override
> malloc the way TCMalloc does if you want to use your own allocator.
>
> > b. Does protobuf provide a way to serialize a class/object
> > partia

[protobuf] Re: Issue 103 in protobuf: Protobuf 2.1.0 missing some sort of pthread linking?

2010-06-03 Thread protobuf


Comment #10 on issue 103 by ken...@google.com: Protobuf 2.1.0 missing some  
sort of pthread linking?

http://code.google.com/p/protobuf/issues/detail?id=103

Could this actually be the same problem solved by comment 7 in issue 188?




[protobuf] Re: Issue 188 in protobuf: protobuf fails to link after compiling with LDFLAGS="-Wl,--as-needed" because of missing -lpthread

2010-06-03 Thread protobuf


Comment #10 on issue 188 by ken...@google.com: protobuf fails to link after  
compiling with LDFLAGS="-Wl,--as-needed" because of missing -lpthread

http://code.google.com/p/protobuf/issues/detail?id=188

Actually, I think it would be more intuitive to change the lines above to
set "done" to "no" in the case where linking an empty file worked. In
that case, we need to continue the tests.

I'm going to get this changed properly. I'll take care of getting gtest
updated as well.




[protobuf] Re: Issue 188 in protobuf: protobuf fails to link after compiling with LDFLAGS="-Wl,--as-needed" because of missing -lpthread

2010-06-03 Thread protobuf


Comment #9 on issue 188 by ken...@google.com: protobuf fails to link after  
compiling with LDFLAGS="-Wl,--as-needed" because of missing -lpthread

http://code.google.com/p/protobuf/issues/detail?id=188

BTW, you modified gtest's copy as well. Can you file a bug against
googletest to get them to fix their copy?




[protobuf] Re: Issue 188 in protobuf: protobuf fails to link after compiling with LDFLAGS="-Wl,--as-needed" because of missing -lpthread

2010-06-03 Thread protobuf

Updates:
Status: Accepted

Comment #8 on issue 188 by ken...@google.com: protobuf fails to link after  
compiling with LDFLAGS="-Wl,--as-needed" because of missing -lpthread

http://code.google.com/p/protobuf/issues/detail?id=188

Thanks!  Will make sure this gets into the next release.




[protobuf] Re: Issue 192 in protobuf: C++0x conformance issue: using reserved keyword 'nullptr' as a name of variable

2010-06-03 Thread protobuf

Updates:
Status: Accepted
Owner: ken...@google.com

Comment #3 on issue 192 by ken...@google.com: C++0x conformance issue:  
using reserved keyword 'nullptr' as a name of variable

http://code.google.com/p/protobuf/issues/detail?id=192

Thanks, will include in next release.
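
For anyone curious what breaks, a tiny illustration (not the actual
protobuf code): in C++0x, 'nullptr' becomes a reserved keyword, so any
variable by that name has to be renamed.

#include <cstddef>

// Valid C++98, ill-formed under C++0x where 'nullptr' is reserved:
//   const char* nullptr = NULL;
// The fix is simply renaming the variable:
void Example() {
  const char* null_chars = NULL;  // compiles in both C++98 and C++0x
  (void) null_chars;
}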




Re: [protobuf] Linking Errors Using Sun Studio 12.1 on SPARC

2010-06-03 Thread Monty Taylor
On 06/03/2010 03:14 PM, Kenton Varda wrote:
> 
> 
On Fri, May 28, 2010 at 11:25 AM, Hering wrote:
> 
> Hi,
> 
> I am trying to build versions 2.3.0 and 2.3.0rc2 with Sun Studio 12
> Update 1 compiler on a Solaris SPARC box without success.  The
> compilation goes well but the linking phase fails for both versions:
> 
> $ ( distro="/GAAL/chenher/share/protobuf/distro-sparc-2.3.0rc2-64bit";
> rm -rf ${distro}; sunstudio_home="/GAAL/chenher/share/sunstudio12.1";
> export PATH="/GAAL/chenher/bin:${JAVA_HOME}/bin:${sunstudio_home}/bin:/
> bin:/usr/bin:/usr/ccs/bin:/sbin:/usr/ucb:"; export LD_LIBRARY_PATH="/
> GAAL/chenher/share/sunstudio12.1/lib"; export CFLAGS="-m64 -
> xcode=pic32"; export CXXFLAGS="-m64 -xcode=pic32 -library=stlport4 -
> template=no%extdef"; export LDFLAGS="-m64"; ./configure --prefix=$
> {distro} "CFLAGS=${CFLAGS}" "CXXFLAGS=${CXXFLAGS}" "LDFLAGS=$
> {LDFLAGS}" )
> ...

Have you tried building it without injecting these extra CFLAGS and
CXXFLAGS? We've already got code in the protobuf configure script that
builds with stlport and no%extdef. (We build with Sun Studio on SPARC and
link protobuf into our own project as part of the Drizzle project.) I
can't say for sure _why_ adding those flags in that way would break the
build, but if you wouldn't mind trying it with no CFLAGS, CXXFLAGS, or
LDFLAGS in your environment and seeing if it works for you, I'd
appreciate it; it will help track down where the problem is.

> $ make clean all check install
> ...
> libtool: compile:  CC -DHAVE_CONFIG_H -I. -I.. -D_REENTRANT -
> xmemalign=8s -m64 -xcode=pic32 -library=stlport4 -template=no%extdef -
> c google/protobuf/compiler/parser.cc -o parser.o >/dev/null 2>&1
> /bin/bash ../libtool --tag=CXX   --mode=link CC -D_REENTRANT   -
> xmemalign=8s -m64 -xcode=pic32 -library=stlport4 -template=no%extdef -
> version-info 6:0:0 -export-dynamic -no-undefined -m64 -o
> libprotobuf.la  -rpath
> /GAAL/chenher/share/protobuf/distro-
> sparc-2.3.0rc2-64bit/lib/sparcv9 common.lo once.lo hash.lo
> extension_set.lo generated_message_util.lo message_lite.lo
> repeated_field.lo wire_format_lite.lo coded_stream.lo
> zero_copy_stream.lo zero_copy_stream_impl_lite.lo strutil.lo
> substitute.lo structurally_valid.lo descriptor.lo descriptor.pb.lo
> descriptor_database.lo dynamic_message.lo extension_set_heavy.lo
> generated_message_reflection.lo message.lo reflection_ops.lo
> service.lo text_format.lo unknown_field_set.lo wire_format.lo
> gzip_stream.lo printer.lo tokenizer.lo zero_copy_stream_impl.lo
> importer.lo parser.lo -lpthread -lz
> libtool: link: CC -G -zdefs -hlibprotobuf.so.6 -o .libs/libprotobuf.so.
> 6.0.0   .libs/common.o .libs/once.o .libs/hash.o .libs/
> extension_set.o .libs/generated_message_util.o .libs/
> message_lite.o .libs/repeated_field.o .libs/wire_format_lite.o .libs/
> coded_stream.o .libs/zero_copy_stream.o .libs/
> zero_copy_stream_impl_lite.o .libs/strutil.o .libs/substitute.o .libs/
> structurally_valid.o .libs/descriptor.o .libs/descriptor.pb.o .libs/
> descriptor_database.o .libs/dynamic_message.o .libs/
> extension_set_heavy.o .libs/generated_message_reflection.o .libs/
> message.o .libs/reflection_ops.o .libs/service.o .libs/
> text_format.o .libs/unknown_field_set.o .libs/wire_format.o .libs/
> gzip_stream.o .libs/printer.o .libs/tokenizer.o .libs/
> zero_copy_stream_impl.o .libs/importer.o .libs/parser.o   -
> library=stlport4 -lpthread -lz -lc   -m64 -m64
> Undefined   first referenced
>  symbol in file
> void __Crun::pure_error() .libs/common.o  (symbol belongs to
> implicit dependency /nfsdata/taqstore-tr18/chenher/share/sunstudio12.1/
> lib/sparc/64/libCrun.so.1)
> void*__Crun::simple_down_cast(void*,const
> __Crun::static_type_info*,const __Crun::static_type_info*) .libs/
> descriptor.o  (symbol belongs to implicit dependency /nfsdata/taqstore-
> tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void*operator new(unsigned long,void*)  .libs/
> common.o  (symbol belongs to implicit dependency /nfsdata/taqstore-
> tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void __Crun::ex_rethrow_q()   .libs/common.o  (symbol belongs to
> implicit dependency /nfsdata/taqstore-tr18/chenher/share/sunstudio12.1/
> lib/sparc/64/libCrun.so.1)
> void __Crun::register_exit_code(void(*)()extern"C") .libs/
> descriptor.o  (symbol belongs to implicit dependency /nfsdata/taqstore-
> tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> bool __Crun::ex_skip().libs/extension_set.o  (symbol
> belongs to implicit dependency /nfsdata/taqstore-tr18/chenher/share/
> sunstudio12.1/l

[protobuf] Re: Issue 194 in protobuf: Failed to compile protobuf::test with Visual Studio 2010: RepeatedFieldBackInsertIterator should be Assignable and Copy Constructible

2010-06-03 Thread protobuf

Updates:
Status: Accepted

Comment #1 on issue 194 by ken...@google.com: Failed to compile  
protobuf::test with Visual Studio 2010: RepeatedFieldBackInsertIterator  
should be Assignable and Copy Constructible

http://code.google.com/p/protobuf/issues/detail?id=194

I think we simply need to remove the "const" qualifier from field_.
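
For context, a minimal sketch (hypothetical names, not the protobuf
source) of why a const data member trips this up: the compiler cannot
generate a copy-assignment operator for such a class, so the iterator
type is no longer Assignable, which is what the VS2010 error is about.

#include <vector>

template <typename T>
class BackInserter {
 public:
  explicit BackInserter(std::vector<T>* field) : field_(field) {}
 private:
  // With "std::vector<T>* const field_;" the implicit copy-assignment
  // operator cannot be generated, so the type is not Assignable.
  // Dropping the const qualifier restores assignability:
  std::vector<T>* field_;
};

int main() {
  std::vector<int> v;
  BackInserter<int> a(&v);
  BackInserter<int> b(&v);
  a = b;  // needs the implicit operator=, which a const member forbids
  return 0;
}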




Re: [protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Kenton Varda
On Thu, Jun 3, 2010 at 3:15 PM, Nader Salehi  wrote:

> To be clear, I do not encode the entire file!  Each file contains many
> small messages, each of which is stored as a length delimited record.
> It is just that there are quite a few messages bundled in one file.
>

Right, that's what I was assuming.  My points still apply.  Unless you are
also building some sort of an index, using checksums, etc., this is an
extremely fragile format.


> I'm assuming that Evan's assessment still stand?
>

Sure.




Re: [protobuf] Linking Errors Using Sun Studio 12.1 on SPARC

2010-06-03 Thread Kenton Varda
On Fri, May 28, 2010 at 11:25 AM, Hering  wrote:

> Hi,
>
> I am trying to build versions 2.3.0 and 2.3.0rc2 with Sun Studio 12
> Update 1 compiler on a Solaris SPARC box without success.  The
> compilation goes well but the linking phase fails for both versions:
>
> $ ( distro="/GAAL/chenher/share/protobuf/distro-sparc-2.3.0rc2-64bit";
> rm -rf ${distro}; sunstudio_home="/GAAL/chenher/share/sunstudio12.1";
> export PATH="/GAAL/chenher/bin:${JAVA_HOME}/bin:${sunstudio_home}/bin:/
> bin:/usr/bin:/usr/ccs/bin:/sbin:/usr/ucb:"; export LD_LIBRARY_PATH="/
> GAAL/chenher/share/sunstudio12.1/lib"; export CFLAGS="-m64 -
> xcode=pic32"; export CXXFLAGS="-m64 -xcode=pic32 -library=stlport4 -
> template=no%extdef"; export LDFLAGS="-m64"; ./configure --prefix=$
> {distro} "CFLAGS=${CFLAGS}" "CXXFLAGS=${CXXFLAGS}" "LDFLAGS=$
> {LDFLAGS}" )
> ...
>
> $ make clean all check install
> ...
> libtool: compile:  CC -DHAVE_CONFIG_H -I. -I.. -D_REENTRANT -
> xmemalign=8s -m64 -xcode=pic32 -library=stlport4 -template=no%extdef -
> c google/protobuf/compiler/parser.cc -o parser.o >/dev/null 2>&1
> /bin/bash ../libtool --tag=CXX   --mode=link CC -D_REENTRANT   -
> xmemalign=8s -m64 -xcode=pic32 -library=stlport4 -template=no%extdef -
> version-info 6:0:0 -export-dynamic -no-undefined -m64 -o
> libprotobuf.la -rpath /GAAL/chenher/share/protobuf/distro-
> sparc-2.3.0rc2-64bit/lib/sparcv9 common.lo once.lo hash.lo
> extension_set.lo generated_message_util.lo message_lite.lo
> repeated_field.lo wire_format_lite.lo coded_stream.lo
> zero_copy_stream.lo zero_copy_stream_impl_lite.lo strutil.lo
> substitute.lo structurally_valid.lo descriptor.lo descriptor.pb.lo
> descriptor_database.lo dynamic_message.lo extension_set_heavy.lo
> generated_message_reflection.lo message.lo reflection_ops.lo
> service.lo text_format.lo unknown_field_set.lo wire_format.lo
> gzip_stream.lo printer.lo tokenizer.lo zero_copy_stream_impl.lo
> importer.lo parser.lo -lpthread -lz
> libtool: link: CC -G -zdefs -hlibprotobuf.so.6 -o .libs/libprotobuf.so.
> 6.0.0   .libs/common.o .libs/once.o .libs/hash.o .libs/
> extension_set.o .libs/generated_message_util.o .libs/
> message_lite.o .libs/repeated_field.o .libs/wire_format_lite.o .libs/
> coded_stream.o .libs/zero_copy_stream.o .libs/
> zero_copy_stream_impl_lite.o .libs/strutil.o .libs/substitute.o .libs/
> structurally_valid.o .libs/descriptor.o .libs/descriptor.pb.o .libs/
> descriptor_database.o .libs/dynamic_message.o .libs/
> extension_set_heavy.o .libs/generated_message_reflection.o .libs/
> message.o .libs/reflection_ops.o .libs/service.o .libs/
> text_format.o .libs/unknown_field_set.o .libs/wire_format.o .libs/
> gzip_stream.o .libs/printer.o .libs/tokenizer.o .libs/
> zero_copy_stream_impl.o .libs/importer.o .libs/parser.o   -
> library=stlport4 -lpthread -lz -lc   -m64 -m64
> Undefined   first referenced
>  symbol in file
> void __Crun::pure_error() .libs/common.o  (symbol belongs to
> implicit dependency /nfsdata/taqstore-tr18/chenher/share/sunstudio12.1/
> lib/sparc/64/libCrun.so.1)
> void*__Crun::simple_down_cast(void*,const
> __Crun::static_type_info*,const __Crun::static_type_info*) .libs/
> descriptor.o  (symbol belongs to implicit dependency /nfsdata/taqstore-
> tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void*operator new(unsigned long,void*)  .libs/
> common.o  (symbol belongs to implicit dependency /nfsdata/taqstore-
> tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void __Crun::ex_rethrow_q()   .libs/common.o  (symbol belongs to
> implicit dependency /nfsdata/taqstore-tr18/chenher/share/sunstudio12.1/
> lib/sparc/64/libCrun.so.1)
> void __Crun::register_exit_code(void(*)()extern"C") .libs/
> descriptor.o  (symbol belongs to implicit dependency /nfsdata/taqstore-
> tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> bool __Crun::ex_skip().libs/extension_set.o  (symbol
> belongs to implicit dependency /nfsdata/taqstore-tr18/chenher/share/
> sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void __Crun::ex_clean()   .libs/extension_set.o  (symbol
> belongs to implicit dependency /nfsdata/taqstore-tr18/chenher/share/
> sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void __Crun::ex_rethrow() .libs/extension_set.o  (symbol
> belongs to implicit dependency /nfsdata/taqstore-tr18/chenher/share/
> sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void operator delete(void*,void*)  .libs/
> extension_set.o  (symbol belongs to implicit dependency /nfsdata/
> taqstore-tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void*operator new[](unsigned long)   .libs/
> extension_set.o  (symbol belongs to implicit dependency /nfsdata/
> taqstore-tr18/chenher/share/sunstudio12.1/lib/sparc/64/libCrun.so.1)
> void*operator new(unsigned long)   .libs/common.o
> (symbol belongs to implicit dependency /nfsdata/taqsto

Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements

2010-06-03 Thread Kenton Varda
Please don't use reflection to reach into private internals of classes you
don't maintain.  We have "public" and "private" for a reason.  Furthermore,
this access may throw a SecurityException if a SecurityManager is in use.

On Mon, May 31, 2010 at 11:25 AM, David Dabbs  wrote:

>
> The approach I found worked the best:
>
> 1. Copy the string into a pre-allocated and re-used char[] array. This
> is needed since the JDK does not permit access to the String's
> char[], to enforce immutability. This is a performance "loss" vs. the
> JDK, which can access the char[] directly
>
>
> Evan,
>
> you may access a String's internals via reflection in a "safe," albeit
> potentially implementation-specific way. See class code below.
> As long as your java.lang.String uses "value" for the char[] and
> "offset" for the storage offset, this should work.
> No sun.misc.Unsafe used. Only tested/used on JDK6.
>
>
> David
>
>
>
> /***
> **/
> import java.lang.reflect.*;
>
>
> public final class ReflectionUtils {
>
>/**
> * There is no explicit Modifier constant for package-private, so 0 is used.
> */
>public static final int MODIFIER_PACKAGE_PRIVATE = 0x0000;
>
>
>/** Field object for accessing String::value character storage. */
>public static final Field STRING_VALUE_FIELD =
>        getFieldAccessible(String.class, "value");
>
>
>/** Field object for accessing String::offset, the first index used in
> the value[] char storage. */
>public static final Field STRING_OFFSET_FIELD =
>        getFieldAccessible(String.class, "offset");
>
>
>/**
> * Package private String constructor which shares value array for
> speed.
> *
> * Use when a number of String objects share the same char[] storage.
> *
> * See String(int offset, int count, char value[]).
> */
>public static final Constructor STRING_PP_CTOR =
> getConstructor(String.class, MODIFIER_PACKAGE_PRIVATE, int.class,
> int.class,
> char[].class);
>
>
>/**
> * To avoid violating final field semantics, take care to only _read_
> * the char[] value returned.
> */
> public static char[] getChars(final String s) {
>try {
>// use reflection to read the char[] value from the string. . .
>return (char[]) STRING_VALUE_FIELD.get(s);
>} catch (Throwable t) {
>return null;
>}
>}
>
>
>public static String sharedString(final char[] chars, final int offset,
> final int length) {
>try {
>return (String) STRING_PP_CTOR.newInstance(offset, length,
> chars);
>} catch (InstantiationException e) {
>e.printStackTrace();
>} catch (IllegalAccessException e) {
>e.printStackTrace();
>} catch (InvocationTargetException e) {
>e.printStackTrace();
>} catch (Throwable t) {
>t.printStackTrace();
>}
>return null;
>}
>
>
>public static Field getFieldAccessible(final Class clazz, final
> String fieldName) {
>Field fld = null;
>try {
>fld = clazz.getDeclaredField(fieldName);
>fld.setAccessible(true);
>} catch (NoSuchFieldException e) {
>e.printStackTrace();
>} catch (SecurityException e) {
>e.printStackTrace();
>}
>return fld;
>}
>
>
>public static Constructor getConstructor(final Class clazz, final
> int searchModifier, final Class... paramTypes) {
>
>if(clazz==null) {
>throw new IllegalArgumentException("A class parameter is
> required");
>}
>
>try {
>//
>// There is no explicit Modifier accessor constant for
>// package-private, so 0 is used.
>//
>
>for (Constructor ctor : clazz.getDeclaredConstructors()) {
>
>if (searchModifier == (ctor.getModifiers() &
> (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED))) {
>//
>// access modifier matches. . .
>//
>final Class[] parameterTypes =
> ctor.getParameterTypes();
>if (parameterTypes.length == paramTypes.length) {
>//
>// same number of parameters. . .
>//
>for (int i = 0; i < parameterTypes.length; i++) {
>if (!parameterTypes[i].equals(paramTypes[i])) {
>// one parameter not of correct type, so
> bail. . .
>// note ctor variable used as success marker
> below
>ctor = null;
>break;
>} else {
> //Type[] gpType =
> ctor.getGenericParameterTypes();
> //for (int j = 0; j < gpType.len

Re: [protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Kenton Varda
Note that writing a 100GB file using CodedStream is probably a bad idea
because:
- Readers will have to read the entire file sequentially; they will not be
able to seek to particular parts.
- One bit of corruption anywhere in the file could potentially render the
entire rest of the file unreadable.

Remember that this stuff was designed for small messages.  You should really
use some sort of seekable, fault-tolerant container format for 100GB of
data.  You can still encode each individual message using protobufs, which
is useful as it allows the container format to treat each message as a
simple byte blob.
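
For readers following this thread, a minimal C++ sketch of the
length-delimited framing being discussed (MyRecord is a hypothetical
generated message type; this adds none of the indexing or checksumming
recommended above):

#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include "my_record.pb.h"  // hypothetical generated type MyRecord

namespace io = google::protobuf::io;

// Append one record: a varint length prefix followed by the payload.
bool WriteRecord(const MyRecord& msg, io::CodedOutputStream* out) {
  out->WriteVarint32(msg.ByteSize());
  return msg.SerializeToCodedStream(out);
}

// Read the next record; returns false on EOF or a malformed record.
bool ReadRecord(io::CodedInputStream* in, MyRecord* msg) {
  google::protobuf::uint32 size;
  if (!in->ReadVarint32(&size)) return false;  // clean EOF
  io::CodedInputStream::Limit limit = in->PushLimit(size);
  bool ok = msg->ParseFromCodedStream(in) && in->ConsumedEntireMessage();
  in->PopLimit(limit);
  return ok;
}

On the read side, recreate the CodedInputStream periodically (or raise
SetTotalBytesLimit()), since it enforces a 64MB total-bytes limit by
default; that is the C++ counterpart of the resetSizeCounter() advice
quoted below.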

On Thu, Jun 3, 2010 at 12:43 PM, Evan Jones  wrote:

> On Jun 3, 2010, at 15:29 , Nader Salehi wrote:
>
>> It is not a single object; I am writing into a coded output stream
>> file which could grow to much larger than 2GB (it's more like 100GB).
>> I also have to read from this file.
>>
>> Is there a performance hit in the above-mentioned scenario?
>>
>
> No, this should work just fine. On the input side, you'll need to call
> CodedInputStream.resetSizeCounter() after each message, otherwise you'll run
> into the size limit.
>
>
> Evan
>
> --
> Evan Jones
> http://evanjones.ca/
>




Re: [protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Evan Jones

On Jun 3, 2010, at 15:29 , Nader Salehi wrote:

It is not a single object; I am writing into a coded output stream
file which could grow to much larger than 2GB (it's more like 100GB).
I also have to read from this file.

Is there a performance hit in the above-mentioned scenario?


No, this should work just fine. On the input side, you'll need to call
CodedInputStream.resetSizeCounter() after each message, otherwise  
you'll run into the size limit.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Evan Jones

On Jun 3, 2010, at 14:18 , Nader Salehi wrote:

I was told that coded streams have issues when they are larger than
2GB.  Is it true, and, if so, what are the issues?


If you have a single object that is 2GB in size, there are 32-bit  
integers that will overflow. However, provided that you  
call .resetSizeCounter() occasionally, I think it should work just  
fine. I'm certainly using a single Java CodedInputStream per long  
lived connection without any trouble. Unclear if I've sent > 2GB of  
data over a single connection though.


Evan

--
Evan Jones
http://evanjones.ca/




[protobuf] Issues with Large Coded Stream Files?

2010-06-03 Thread Nader Salehi
I was told that coded streams have issues when they are larger than
2GB.  Is it true, and, if so, what are the issues?

Cheers,
Nader




Re: [protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread Kenton Varda
That's correct.  Sorry, but encoding large chunks of data that cannot be
parsed or serialized all at once is an explicit non-goal of protocol
buffers.  Fulfilling such needs involves adding a great deal of complication
to the system.  Your case may seem relatively simple, but other, more
complicated cases require things like random access, search indexes, etc.
 We chose to focus only on small messages.

It is certainly possible and useful to use protocol buffers as a building
block when designing a format for large data sets.

On Thu, Jun 3, 2010 at 9:44 AM, dirtysalt1...@gmail.com <
dirtysalt1...@gmail.com> wrote:

>  I think what you mean is that "I should redesign my protocol to implement
> streaming functionality on top of protobuf, INSTEAD OF expecting protobuf
> to implement it."
>
> What I used to assume was "App -> Protobuf -> Stream Functionality"
> [protobuf provides streaming directly; at the top, my app faces one large
> protobuf].
> And I think what you mean is "App -> Stream Functionality -> Protobuf" [I
> have to implement streaming myself, but each stream packet is based on
> protobuf; at the top, my app faces many small stream packets, each a
> protobuf].
>
> Linkedin:http://www.linkedin.com/in/dirlt
>
>
On 2010/6/4 0:21, Jason Hsueh wrote:
>
> This really needs to be handled in the application since protobuf has no
> idea which fields are expendable or can be truncated. What I was trying to
> suggest earlier was to construct many Req protobufs and serialize those
> individually. i.e., instead of 1 Req proto with 1,000,000 page ids,
> construct 1000 Req protos, each containing 1000 page ids. You can serialize
> each of those individually, stopping when you hit your memory budget.
>
>  That being said, I would suggest redesigning your protocol so that you
> don't have to construct enormous messages. It sounds like what you really
> want is something like the streaming functionality in the rpc service -
> rather than sending one large protobuf you would want to stream each page
> id.
>
> On Thu, Jun 3, 2010 at 6:27 AM, dirlt  wrote:
>
>> Thank you for your reply. :) For the first one, I think your answer is
>> quite clear.
>>
>> But for the second one, I do want to produce the serialization of Req.
>>
>> Let me explain again. :) Assume my application is like this:
>> 0. the server app wants to send 1,000,000 page ids to the client
>> 1. if the server app packs all 1,000,000 page ids and serializes them,
>> it will cost 1GB of memory
>>
>> 2. but the server app can only allocate 100MB of memory, so obviously
>> the server app can't send all 1,000,000 page ids to the client
>>
>> 3. meanwhile, the server app's protobuf is very clever. It [protobuf]
>> can calculate that "if the server app has 100MB, it can hold at most
>> 10,000 page ids". So protobuf tells the server: "Hi server, if you only
>> have 100MB of memory, I can only hold 10,000 page ids"
>>
>> 4. so the server app knows this, and serializes just 10,000 page ids
>> into memory instead of 1,000,000
>>
>> I hope I have clarified it now. If protobuf doesn't implement this, do
>> you have any ideas about it?
>>
>>
>> On Jun 3, 12:40 am, Jason Hsueh  wrote:
>>  > On Tue, Jun 1, 2010 at 6:21 AM, bnh  wrote:
>> > > I'm using protobuf as the protocol for a distributed system. But now
>> > > I have some questions about protobuf:
>> >
>> > > a. Does protobuf provide an interface for a user-defined allocator?
>> > > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
>> > > think I can optimize the memory allocation according to my application.
>> >
>> > No, there are no hooks for providing an allocator. You'd need to
>> > override malloc the way TCMalloc does if you want to use your own
>> > allocator.
>> >
>> > > b. Does protobuf provide a way to serialize a class/object partially
>> > > [or do you have some ideas about it]? Because my application is very
>> > > sensitive to memory usage. Such as this class:
>> >
>> > > class Req{
>> > > int userid;
>> > > vector pageid;
>> > > };
>> >
>> > > I want to pack 1000 page ids into the Req. But if I pack all of them,
>> > > the Req's size is about 1GB [hypothetically]. But I only have 100MB of
>> > > memory, so I plan to pack as many page ids as possible until the memory
>> > > usage of Req is about 100MB ['serialize the object partially according
>> > > to memory usage'].
>> >
>> > Are you talking about producing the serialization of Req, with a large
>> > number of PageIds, or parsing such a serialization into an in-memory
>> > object? For the former, you can serialize in smaller pieces, and just
>> > concatenate the serializations:
>> > http://code.google.com/apis/protocolbuffers/docs/encoding.html#optional
>> > For the latter, there is no way for you to tell the parser to stop
>> > parsing when memory usage reaches a certain limit. However, you can do
>> > this yourself if you split the serialization into multiple pieces.
>> >
>> >
>> >

Re: [protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread Jason Hsueh
Ah, one option I missed is using an implementation of
io::ZeroCopyOutputStream like io::FileOutputStream, which uses a fixed size
buffer and flushes data to the file (socket) when the buffer is full. Then
serializing a large message won't consume a lot of memory. Perhaps this is
what you really wanted, rather than truncating the message?

However, you still need enough ram for the in-memory message object, and
those are typically larger than the serialized form. Also, this approach may
or may not work with your RPC system. It is probably still worthwhile for
you to look at reworking your message definition so that you transmit
smaller messages.
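
For concreteness, a minimal sketch of that approach (Req is a
hypothetical generated message type; error handling elided):

#include <google/protobuf/io/zero_copy_stream_impl.h>
#include "req.pb.h"  // hypothetical generated type Req

// Serialize straight to a file descriptor: only FileOutputStream's
// fixed-size internal buffer is resident, not a full serialized copy.
bool WriteToFd(const Req& req, int fd) {
  google::protobuf::io::FileOutputStream out(fd);
  bool ok = req.SerializeToZeroCopyStream(&out);
  return out.Flush() && ok;
}

Since io::FileOutputStream wraps a plain file descriptor, the same sketch
applies to a socket fd, as Jason notes above.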

On Thu, Jun 3, 2010 at 9:44 AM, dirtysalt1...@gmail.com <
dirtysalt1...@gmail.com> wrote:

>  I think what you mean is that "I should redesign my protocol to implement
> streaming functionality on top of protobuf, INSTEAD OF expecting protobuf
> to implement it."
>
> What I used to assume was "App -> Protobuf -> Stream Functionality"
> [protobuf provides streaming directly; at the top, my app faces one large
> protobuf].
> And I think what you mean is "App -> Stream Functionality -> Protobuf" [I
> have to implement streaming myself, but each stream packet is based on
> protobuf; at the top, my app faces many small stream packets, each a
> protobuf].
>
> Linkedin:http://www.linkedin.com/in/dirlt
>
>
On 2010/6/4 0:21, Jason Hsueh wrote:
>
> This really needs to be handled in the application since protobuf has no
> idea which fields are expendable or can be truncated. What I was trying to
> suggest earlier was to construct many Req protobufs and serialize those
> individually. i.e., instead of 1 Req proto with 1,000,000 page ids,
> construct 1000 Req protos, each containing 1000 page ids. You can serialize
> each of those individually, stopping when you hit your memory budget.
>
>  That being said, I would suggest redesigning your protocol so that you
> don't have to construct enormous messages. It sounds like what you really
> want is something like the streaming functionality in the rpc service -
> rather than sending one large protobuf you would want to stream each page
> id.
>
> On Thu, Jun 3, 2010 at 6:27 AM, dirlt  wrote:
>
>> Thank you for your reply. :) For the first one, I think your answer is
>> quite clear.
>>
>> But for the second one, I do want to produce the serialization of Req.
>>
>> Let me explain again. :) Assume my application is like this:
>> 0. the server app wants to send 1,000,000 page ids to the client
>> 1. if the server app packs all 1,000,000 page ids and serializes them,
>> it will cost 1GB of memory
>>
>> 2. but the server app can only allocate 100MB of memory, so obviously
>> the server app can't send all 1,000,000 page ids to the client
>>
>> 3. meanwhile, the server app's protobuf is very clever. It [protobuf]
>> can calculate that "if the server app has 100MB, it can hold at most
>> 10,000 page ids". So protobuf tells the server: "Hi server, if you only
>> have 100MB of memory, I can only hold 10,000 page ids"
>>
>> 4. so the server app knows this, and serializes just 10,000 page ids
>> into memory instead of 1,000,000
>>
>> I hope I have clarified it now. If protobuf doesn't implement this, do
>> you have any ideas about it?
>>
>>
>> On Jun 3, 12:40 am, Jason Hsueh  wrote:
>>  > On Tue, Jun 1, 2010 at 6:21 AM, bnh  wrote:
>> > > I'm using protobuf as the protocol for a distributed system. But now
>> > > I have some questions about protobuf:
>> >
>> > > a. Does protobuf provide an interface for a user-defined allocator?
>> > > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
>> > > think I can optimize the memory allocation according to my application.
>> >
>> > No, there are no hooks for providing an allocator. You'd need to
>> > override malloc the way TCMalloc does if you want to use your own
>> > allocator.
>> >
>> > > b. Does protobuf provide a way to serialize a class/object partially
>> > > [or do you have some ideas about it]? Because my application is very
>> > > sensitive to memory usage. Such as this class:
>> >
>> > > class Req{
>> > > int userid;
>> > > vector pageid;
>> > > };
>> >
>> > > I want to pack 1000 page ids into the Req. But if I pack all of them,
>> > > the Req's size is about 1GB [hypothetically]. But I only have 100MB of
>> > > memory, so I plan to pack as many page ids as possible until the memory
>> > > usage of Req is about 100MB ['serialize the object partially according
>> > > to memory usage'].
>> >
>> > Are you talking about producing the serialization of Req, with a large
>> > number of PageIds, or parsing such a serialization into an in-memory
>> > object? For the former, you can serialize in smaller pieces, and just
>> > concatenate the serializations:
>> > http://code.google.com/apis/protocolbuffers/docs/encoding.html#optional
>> > For the latter, there is no way for you to tell the parser to stop
>> > parsing when memory usage reaches a certain limit. However, you can do this
>>

Re: [protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread dirtysalt1...@gmail.com
I think what you mean is that "I should redesign my protocol to implement
streaming functionality on top of protobuf, INSTEAD OF expecting protobuf
to implement it."

What I used to assume was "App -> Protobuf -> Stream Functionality"
[protobuf provides streaming directly; at the top, my app faces one
large protobuf].
And I think what you mean is "App -> Stream Functionality -> Protobuf"
[I have to implement streaming myself, but each stream packet is based
on protobuf; at the top, my app faces many small stream packets, each a
protobuf].


Linkedin:http://www.linkedin.com/in/dirlt


On 2010/6/4 0:21, Jason Hsueh wrote:
This really needs to be handled in the application since protobuf has 
no idea which fields are expendable or can be truncated. What I was 
trying to suggest earlier was to construct many Req protobufs and 
serialize those individually. i.e., instead of 1 Req proto with 
1,000,000 page ids, construct 1000 Req protos, each containing 1000 
page ids. You can serialize each of those individually, stopping when 
you hit your memory budget.


That being said, I would suggest redesigning your protocol so that you 
don't have to construct enormous messages. It sounds like what you 
really want is something like the streaming functionality in the rpc 
service - rather than sending one large protobuf you would want to 
stream each page id.


On Thu, Jun 3, 2010 at 6:27 AM, dirlt <dirtysalt1...@gmail.com> wrote:


Thank you for your reply. :) For the first one, I think your answer is
quite clear.

But for the second one, I do want to produce the serialization of Req.

Let me explain again. :) Assume my application is like this:
0. the server app wants to send 1,000,000 page ids to the client
1. if the server app packs all 1,000,000 page ids and serializes them, it
will cost 1GB of memory

2. but the server app can only allocate 100MB of memory, so obviously the
server app can't send all 1,000,000 page ids to the client

3. meanwhile, the server app's protobuf is very clever. It [protobuf] can
calculate that "if the server app has 100MB, it can hold at most 10,000
page ids". So protobuf tells the server: "Hi server, if you only have
100MB of memory, I can only hold 10,000 page ids"

4. so the server app knows this, and serializes just 10,000 page ids into
memory instead of 1,000,000

I hope I have clarified it now. If protobuf doesn't implement this, do
you have any ideas about it?


On Jun 3, 12:40 am, Jason Hsueh <jas...@google.com> wrote:
> On Tue, Jun 1, 2010 at 6:21 AM, bnh <baoneng...@gmail.com> wrote:
> > I'm using protobuf as the protocol for a distributed system. But now I
> > have some questions about protobuf:
>
> > a. Does protobuf provide an interface for a user-defined allocator?
> > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
> > think I can optimize the memory allocation according to my application.
>
> No, there are no hooks for providing an allocator. You'd need to
> override malloc the way TCMalloc does if you want to use your own
> allocator.
>
> > b. Does protobuf provide a way to serialize a class/object partially
> > [or do you have some ideas about it]? Because my application is very
> > sensitive to memory usage. Such as this class:
>
> > class Req{
> > int userid;
> > vector pageid;
> > };
>
> > I want to pack 1000 page ids into the Req. But if I pack all of them,
> > the Req's size is about 1GB [hypothetically]. But I only have 100MB of
> > memory, so I plan to pack as many page ids as possible until the memory
> > usage of Req is about 100MB ['serialize the object partially according
> > to memory usage'].
>
> Are you talking about producing the serialization of Req, with a large
> number of PageIds, or parsing such a serialization into an in-memory
> object? For the former, you can serialize in smaller pieces, and just
> concatenate the serializations:
> http://code.google.com/apis/protocolbuffers/docs/encoding.html#optional
> For the latter, there is no way for you to tell the parser to stop
> parsing when memory usage reaches a certain limit. However, you can do
> this yourself if you split the serialization into multiple pieces.
>
>
>

Re: [protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread Jason Hsueh
This really needs to be handled in the application since protobuf has no
idea which fields are expendable or can be truncated. What I was trying to
suggest earlier was to construct many Req protobufs and serialize those
individually. i.e., instead of 1 Req proto with 1,000,000 page ids,
construct 1000 Req protos, each containing 1000 page ids. You can serialize
each of those individually, stopping when you hit your memory budget.

That being said, I would suggest redesigning your protocol so that you don't
have to construct enormous messages. It sounds like what you really want is
something like the streaming functionality in the rpc service - rather than
sending one large protobuf you would want to stream each page id.
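
A minimal sketch of the chunking idea, under the assumption of a message
like "message Req { repeated int64 pageid = 1; }" (names hypothetical).
It relies on the documented property that concatenated serializations of
the same type merge, with repeated fields concatenated:

#include <string>
#include <vector>
#include "req.pb.h"  // hypothetical: message Req { repeated int64 pageid = 1; }

// Serialize page ids 1000 at a time, stopping at a byte budget. The
// concatenated output still parses as one Req, since repeated fields in
// concatenated serializations merge. The budget check runs before each
// chunk, so the result may overshoot by at most one chunk.
std::string SerializeChunked(const std::vector<long long>& ids, size_t budget) {
  std::string out;
  for (size_t i = 0; i < ids.size() && out.size() < budget; ) {
    Req chunk;
    for (int j = 0; j < 1000 && i < ids.size(); ++j, ++i) {
      chunk.add_pageid(ids[i]);
    }
    chunk.AppendToString(&out);  // appends this chunk's serialization
  }
  return out;
}

The receiver can either parse the whole concatenation back into one
merged Req or frame and handle each chunk separately, per the encoding
doc linked in the quoted reply below.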

On Thu, Jun 3, 2010 at 6:27 AM, dirlt  wrote:

> Thank you for your reply. :) For the first one, I think your answer is
> quite clear.
>
> But for the second one, I do want to produce the serialization of Req.
>
> Let me explain again. :) Assume my application is like this:
> 0. the server app wants to send 1,000,000 page ids to the client
> 1. if the server app packs all 1,000,000 page ids and serializes them,
> it will cost 1GB of memory
>
> 2. but the server app can only allocate 100MB of memory, so obviously
> the server app can't send all 1,000,000 page ids to the client
>
> 3. meanwhile, the server app's protobuf is very clever. It [protobuf]
> can calculate that "if the server app has 100MB, it can hold at most
> 10,000 page ids". So protobuf tells the server: "Hi server, if you only
> have 100MB of memory, I can only hold 10,000 page ids"
>
> 4. so the server app knows this, and serializes just 10,000 page ids
> into memory instead of 1,000,000
>
> I hope I have clarified it now. If protobuf doesn't implement this, do
> you have any ideas about it?
>
>
> On Jun 3, 12:40 am, Jason Hsueh  wrote:
> > On Tue, Jun 1, 2010 at 6:21 AM, bnh  wrote:
> > > I'm using protobuf as the protocol for a distributed system. But now
> > > I have some questions about protobuf:
> >
> > > a. Does protobuf provide an interface for a user-defined allocator?
> > > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
> > > think I can optimize the memory allocation according to my application.
> >
> > No, there are no hooks for providing an allocator. You'd need to override
> > malloc the way TCMalloc does if you want to use your own allocator.
> >
> > > b. Does protobuf provide a way to serialize a class/object partially
> > > [or do you have some ideas about it]? Because my application is very
> > > sensitive to memory usage. Such as this class:
> >
> > > class Req{
> > > int userid;
> > > vector pageid;
> > > };
> >
> > > I want to pack 1000 page ids into the Req. But if I pack all of them,
> > > the Req's size is about 1GB [hypothetically]. But I only have 100MB of
> > > memory, so I plan to pack as many page ids as possible until the memory
> > > usage of Req is about 100MB ['serialize the object partially according
> > > to memory usage'].
> >
> > Are you talking about producing the serialization of Req, with a large
> > number of PageIds, or parsing such a serialization into an in-memory
> > object? For the former, you can serialize in smaller pieces, and just
> > concatenate the serializations:
> > http://code.google.com/apis/protocolbuffers/docs/encoding.html#optional
> > For the latter, there is no way for you to tell the parser to stop
> > parsing when memory usage reaches a certain limit. However, you can do
> > this yourself if you split the serialization into multiple pieces.
> >
> >
> >
>



[protobuf] Re: Questions/Ideas about Protobuf

2010-06-03 Thread dirlt
Thank you for your reply. :) For the first one, I think your answer is
quite clear.

But for the second one, I do want to produce the serialization of Req.

Let me explain again. :) Assume my application is like this:
0. the server app wants to send 1,000,000 page ids to the client
1. if the server app packs all 1,000,000 page ids and serializes them, it
will cost 1GB of memory

2. but the server app can only allocate 100MB of memory, so obviously the
server app can't send all 1,000,000 page ids to the client

3. meanwhile, the server app's protobuf is very clever. It [protobuf] can
calculate that "if the server app has 100MB, it can hold at most 10,000
page ids". So protobuf tells the server: "Hi server, if you only have
100MB of memory, I can only hold 10,000 page ids"

4. so the server app knows this, and serializes just 10,000 page ids into
memory instead of 1,000,000

I hope I have clarified it now. If protobuf doesn't implement this, do
you have any ideas about it?


On Jun 3, 12:40 am, Jason Hsueh  wrote:
> On Tue, Jun 1, 2010 at 6:21 AM, bnh  wrote:
> > I'm using protobuf as the protocol for a distributed system. But now I
> > have some questions about protobuf:
>
> > a. Does protobuf provide an interface for a user-defined allocator?
> > Sometimes I find 'malloc' costs too much. I've tried TCMalloc, but I
> > think I can optimize the memory allocation according to my application.
>
> No, there are no hooks for providing an allocator. You'd need to override
> malloc the way TCMalloc does if you want to use your own allocator.
>
> > b. Does protobuf provide a way to serialize a class/object partially
> > [or do you have some ideas about it]? Because my application is very
> > sensitive to memory usage. Such as this class:
>
> > class Req{
> > int userid;
> > vector pageid;
> > };
>
> > I want to pack 1000 page ids into the Req. But if I pack all of them,
> > the Req's size is about 1GB [hypothetically]. But I only have 100MB of
> > memory, so I plan to pack as many page ids as possible until the memory
> > usage of Req is about 100MB ['serialize the object partially according
> > to memory usage'].
>
> Are you talking about producing the serialization of Req, with a large
> number of PageIds, or parsing such a serialization into an in-memory
> object? For the former, you can serialize in smaller pieces, and just
> concatenate the serializations:
> http://code.google.com/apis/protocolbuffers/docs/encoding.html#optional
> For the latter, there is no way for you to tell the parser to stop
> parsing when memory usage reaches a certain limit. However, you can do
> this yourself if you split the serialization into multiple pieces.
>
>
>
