[HACKERS] Streaming solution and v3.1 protocol

2011-06-03 Thread Radosław Smogura
Hello,

Sorry for the short introduction to this, and please, as far as possible,
disconnect it from LOBs, as it is on top of LOB.

The idea of streaming is to reduce memory copying, mainly while receiving and
sending tuples. Currently, receive works as follows:
1. Read the bytes of the tuple (allocating x bytes of memory).
2. Possibly convert them to the database encoding.
3. Use this data to create a datum (which is a slightly changed copy of 1 or 2).

Streaming will be allowed only in binary mode, and stream in/out will actually
return binary data.

Look, for example, at the execution chain starting from textrecv.

The idea is to add stream_in and stream_out columns to pg_type.

When a value is to be serialized, sin/sout is called. The call must pass the
length of the data and a stream struct (similar to C FILE*).

The caller should validate that all bytes have been consumed (expose simple
methods for this).
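
A minimal sketch of what such a stream handle and the "all bytes consumed"
check might look like. Every name here (PgStream, pgstream_read,
pgstream_fully_consumed) is hypothetical and not existing PostgreSQL API, and
an in-memory buffer stands in for the socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream handle; none of these names exist in PostgreSQL. */
typedef struct PgStream
{
    const unsigned char *data;  /* in-memory stand-in for the socket buffer */
    size_t      len;            /* declared length of the value */
    size_t      pos;            /* bytes consumed so far */
} PgStream;

static void
pgstream_init(PgStream *s, const void *data, size_t len)
{
    s->data = data;
    s->len = len;
    s->pos = 0;
}

/* Read up to n bytes into dst; returns the number of bytes actually read. */
static size_t
pgstream_read(PgStream *s, void *dst, size_t n)
{
    size_t      avail;

    assert(s->pos <= s->len);
    avail = s->len - s->pos;
    if (n > avail)
        n = avail;
    memcpy(dst, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Bytes that a type's stream_in function has not consumed yet. */
static size_t
pgstream_remaining(const PgStream *s)
{
    return s->len - s->pos;
}

/* What the caller would check after the stream_in function returns. */
static int
pgstream_fully_consumed(const PgStream *s)
{
    return pgstream_remaining(s) == 0;
}
```

The caller would hand the PgStream to the type's stream_in function and raise
a protocol error if pgstream_fully_consumed() is false afterwards.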

To implement (code API requirements):
The first stream is a buffered socket reader.

Fast substreams - create a fast stream limited to x bytes, based on another stream

Skipping bytes + skipAll()
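
A rough sketch of how a length-limited substream and the skip operations could
fit together. The Stream type and all function names are hypothetical, chosen
only for illustration; the base stream reads from memory instead of a socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names, for illustration only. */
typedef struct Stream
{
    struct Stream *parent;      /* NULL for the base stream */
    const unsigned char *data;  /* base stream only: stand-in for the socket */
    size_t      limit;          /* total bytes this stream may deliver */
    size_t      pos;            /* bytes delivered or skipped so far */
} Stream;

static Stream
stream_from_buffer(const void *data, size_t len)
{
    Stream      s = {NULL, data, len, 0};

    return s;
}

/* Create a view limited to the next n bytes of parent; no data is copied. */
static Stream
substream(Stream *parent, size_t n)
{
    Stream      s = {parent, NULL, n, 0};

    assert(n <= parent->limit - parent->pos);
    return s;
}

/* Read up to n bytes, never going past this stream's limit. */
static size_t
stream_read(Stream *s, void *dst, size_t n)
{
    size_t      avail = s->limit - s->pos;

    if (n > avail)
        n = avail;
    if (s->parent != NULL)
        n = stream_read(s->parent, dst, n);
    else
        memcpy(dst, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Skip up to n bytes without copying them anywhere. */
static size_t
stream_skip(Stream *s, size_t n)
{
    size_t      avail = s->limit - s->pos;

    if (n > avail)
        n = avail;
    if (s->parent != NULL)
        n = stream_skip(s->parent, n);
    s->pos += n;
    return n;
}

/* Discard whatever is left, leaving the parent just past this value. */
static size_t
stream_skip_all(Stream *s)
{
    return stream_skip(s, s->limit - s->pos);
}
```

With this shape, a receive function that stops early cannot desynchronize the
connection: the caller runs stream_skip_all() on the substream and the base
stream is positioned at the next value.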

Stream filtering - do fast encoding conversion (it will be faster if the
conversion occurs in buffered chunks).
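
A minimal sketch of chunked conversion in a stream filter, using LATIN1 to
UTF-8 purely as an example. The names filter_stream, sink_fn and BufferSink are
hypothetical, and the tiny CHUNK size is chosen only so that the chunk boundary
is easy to exercise. LATIN1 is single-byte, so a chunk boundary never splits a
character; a multi-byte source encoding would have to carry a partial sequence
over to the next chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 8                 /* tiny on purpose; a real buffer would be KBs */

/* Convert n LATIN1 bytes to UTF-8; dst must have room for 2 * n bytes. */
static size_t
latin1_to_utf8_chunk(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t      out = 0;

    for (size_t i = 0; i < n; i++)
    {
        unsigned char c = src[i];

        if (c < 0x80)
            dst[out++] = c;
        else
        {
            dst[out++] = (unsigned char) (0xC0 | (c >> 6));
            dst[out++] = (unsigned char) (0x80 | (c & 0x3F));
        }
    }
    return out;
}

/* Hypothetical consumer of converted bytes. */
typedef size_t (*sink_fn) (void *ctx, const unsigned char *buf, size_t n);

/* Push src through the filter chunk by chunk; only one chunk is buffered. */
static size_t
filter_stream(const unsigned char *src, size_t len, sink_fn sink, void *ctx)
{
    unsigned char out[2 * CHUNK];
    size_t      total = 0;

    while (len > 0)
    {
        size_t      n = len < CHUNK ? len : CHUNK;
        size_t      produced = latin1_to_utf8_chunk(src, n, out);

        total += sink(ctx, out, produced);
        src += n;
        len -= n;
    }
    return total;
}

/* Simple sink that collects the output, used only to demonstrate the filter. */
typedef struct BufferSink
{
    unsigned char buf[64];
    size_t      len;
} BufferSink;

static size_t
buffer_sink(void *ctx, const unsigned char *buf, size_t n)
{
    BufferSink *b = ctx;

    assert(b->len + n <= sizeof(b->buf));
    memcpy(b->buf + b->len, buf, n);
    b->len += n;
    return n;
}
```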

Support for a direct PG printf() version. Linux can create cookie streams and
use them with fprintf(), which is a great advantage when formatting huge
strings. Other systems should buffer the output. The problem is: if a Linux
cookie stream fails, will it already have written something to the output? A
Windows proxy would push the value to a temporary buffer.
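
For reference, a small sketch of the cookie-stream mechanism mentioned above,
using glibc's fopencookie(). It is glibc-specific and requires _GNU_SOURCE.
The OutBuf type and the outbuf_* names are hypothetical; a fixed buffer stands
in for the connection's output.

```c
#define _GNU_SOURCE             /* fopencookie() is a GNU extension */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical output target standing in for the client connection. */
typedef struct OutBuf
{
    char        data[256];
    size_t      len;
} OutBuf;

/* Called by stdio whenever it flushes its own buffer to the cookie. */
static ssize_t
outbuf_write(void *cookie, const char *buf, size_t size)
{
    OutBuf     *o = cookie;

    if (o->len + size > sizeof(o->data))
        return 0;               /* fopencookie(3): return 0 on error */
    memcpy(o->data + o->len, buf, size);
    o->len += size;
    return (ssize_t) size;
}

/* Wrap an OutBuf in a FILE * so that fprintf() can format straight into it. */
static FILE *
outbuf_open(OutBuf *o)
{
    cookie_io_functions_t io = {.write = outbuf_write};

    assert(o != NULL);
    o->len = 0;
    return fopencookie(o, "w", io);
}
```

Because stdio buffers internally, the write callback may not run until the
FILE is flushed or closed, so a failure can be reported some time after
fprintf() itself has returned successfully.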

A good idea may be to introduce a new version of the protocol that reserves
these len field values:
(-2) for fixed-size streams above 4GB
(-3) for chunked streaming - this is actually innovative functionality and is
not required by any driver.
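
A sketch of how a reader might decode such a len field. In the current
protocol the field is a 32-bit integer and -1 marks a NULL value; everything
else here is an assumption made for illustration, in particular that -2 is
followed by a 64-bit big-endian length. The proposal does not specify the
layout, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    LEN_NULL,                   /* -1: existing protocol, NULL value */
    LEN_FIXED,                  /* >= 0: existing protocol, 32-bit length */
    LEN_FIXED64,                /* -2: proposed, assumed 64-bit length follows */
    LEN_CHUNKED,                /* -3: proposed, chunked stream follows */
    LEN_INVALID                 /* any other negative value */
} LenKind;

typedef struct LenHeader
{
    LenKind     kind;
    uint64_t    length;         /* meaningful for LEN_FIXED and LEN_FIXED64 */
    size_t      header_bytes;   /* bytes of header consumed */
} LenHeader;

static uint32_t
read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
        ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

static uint64_t
read_be64(const unsigned char *p)
{
    return ((uint64_t) read_be32(p) << 32) | read_be32(p + 4);
}

/* p must point at 4 readable bytes, or 12 if the field turns out to be -2. */
static LenHeader
parse_len_header(const unsigned char *p)
{
    LenHeader   h = {LEN_FIXED, 0, 4};
    int32_t     len;

    assert(p != NULL);
    len = (int32_t) read_be32(p);
    if (len >= 0)
        h.length = (uint64_t) len;
    else if (len == -1)
        h.kind = LEN_NULL;
    else if (len == -2)
    {
        h.kind = LEN_FIXED64;
        h.length = read_be64(p + 4);    /* assumed layout, see lead-in */
        h.header_bytes = 12;
    }
    else if (len == -3)
        h.kind = LEN_CHUNKED;
    else
        h.kind = LEN_INVALID;
    return h;
}
```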

In streaming, one can imagine that the socket's fd will be passed to the sin
functions.

Problems: during output, something fails while writing. Resolution: add some
control flags after every n bytes sent to the client. This prevents sending
e.g. 4GB of data when the first byte failed; you send only n bytes and then
the abort is received. Alternatively, send the data in frames.
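
A minimal sketch of the framing idea, assuming the sender checks for an abort
between frames. The FrameSink type and all function names are hypothetical,
and the abort is simulated by a counter rather than a real message from the
client.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sink that counts frames and can simulate a client abort. */
typedef struct FrameSink
{
    size_t      frames_sent;
    size_t      bytes_sent;
    size_t      abort_after_frames;     /* 0 means the client never aborts */
} FrameSink;

/* Stand-in for writing one frame to the socket; returns 0 on success. */
static int
send_frame(FrameSink *sink, const unsigned char *buf, size_t n)
{
    (void) buf;
    sink->frames_sent++;
    sink->bytes_sent += n;
    return 0;
}

/* Stand-in for checking whether the client has asked us to stop. */
static int
abort_received(const FrameSink *sink)
{
    return sink->abort_after_frames != 0 &&
        sink->frames_sent >= sink->abort_after_frames;
}

/* Send data in frames of frame_size bytes; returns the bytes actually sent. */
static size_t
send_in_frames(FrameSink *sink, const unsigned char *data, size_t len,
               size_t frame_size)
{
    size_t      sent = 0;

    assert(frame_size > 0);
    while (sent < len)
    {
        size_t      n = len - sent < frame_size ? len - sent : frame_size;

        if (abort_received(sink))
            break;              /* stop before pushing the rest of the value */
        if (send_frame(sink, data + sent, n) != 0)
            break;
        sent += n;
    }
    return sent;
}
```

With a frame size of n bytes, at most one frame is wasted after the client
signals a problem, instead of the whole value.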

Regards,
Radek

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming solution and v3.1 protocol

2011-06-03 Thread Heikki Linnakangas

On 03.06.2011 19:19, Radosław Smogura wrote:

> Hello,
>
> Sorry for the short introduction to this, and please, as far as possible,
> disconnect it from LOBs, as it is on top of LOB.
>
> The idea of streaming is to reduce memory copying, mainly while receiving
> and sending tuples. Currently, receive works as follows:
> 1. Read the bytes of the tuple (allocating x bytes of memory).
> 2. Possibly convert them to the database encoding.
> 3. Use this data to create a datum (which is a slightly changed copy of 1
>    or 2).
>
> Streaming will be allowed only in binary mode, and stream in/out will
> actually return binary data.


Hmm, I was thinking that streaming would be a whole new mode, alongside 
the current text and binary mode.



> Look, for example, at the execution chain starting from textrecv.
>
> The idea is to add stream_in and stream_out columns to pg_type.
>
> When a value is to be serialized, sin/sout is called. The call must pass
> the length of the data and a stream struct (similar to C FILE*).
>
> The caller should validate that all bytes have been consumed (expose simple
> methods for this).
>
> To implement (code API requirements):
> The first stream is a buffered socket reader.
>
> Fast substreams - create a fast stream limited to x bytes, based on another
> stream
>
> Skipping bytes + skipAll()
>
> Stream filtering - do fast encoding conversion (it will be faster if the
> conversion occurs in buffered chunks).
>
> Support for a direct PG printf() version. Linux can create cookie streams
> and use them with fprintf(), which is a great advantage when formatting
> huge strings. Other systems should buffer the output. The problem is: if a
> Linux cookie stream fails, will it already have written something to the
> output? A Windows proxy would push the value to a temporary buffer.


This is pretty low-level stuff, I think we should focus on the protocol 
changes and user-visible libpq API first.


However, we don't want to use anything Linux-specific here, so cookie
streams are not an option.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Streaming solution and v3.1 protocol

2011-06-03 Thread Merlin Moncure
On Fri, Jun 3, 2011 at 12:04 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
> On 03.06.2011 19:19, Radosław Smogura wrote:
>
> > Hello,
> >
> > Sorry for the short introduction to this, and please, as far as possible,
> > disconnect it from LOBs, as it is on top of LOB.
> >
> > The idea of streaming is to reduce memory copying, mainly while receiving
> > and sending tuples. Currently, receive works as follows:
> > 1. Read the bytes of the tuple (allocating x bytes of memory).
> > 2. Possibly convert them to the database encoding.
> > 3. Use this data to create a datum (which is a slightly changed copy of 1
> >    or 2).
> >
> > Streaming will be allowed only in binary mode, and stream in/out will
> > actually return binary data.
>
> Hmm, I was thinking that streaming would be a whole new mode, alongside the
> current text and binary mode.
>
> > Look, for example, at the execution chain starting from textrecv.
> >
> > The idea is to add stream_in and stream_out columns to pg_type.
> >
> > When a value is to be serialized, sin/sout is called. The call must pass
> > the length of the data and a stream struct (similar to C FILE*).
> >
> > The caller should validate that all bytes have been consumed (expose
> > simple methods for this).
> >
> > To implement (code API requirements):
> > The first stream is a buffered socket reader.
> >
> > Fast substreams - create a fast stream limited to x bytes, based on
> > another stream
> >
> > Skipping bytes + skipAll()
> >
> > Stream filtering - do fast encoding conversion (it will be faster if the
> > conversion occurs in buffered chunks).
> >
> > Support for a direct PG printf() version. Linux can create cookie streams
> > and use them with fprintf(), which is a great advantage when formatting
> > huge strings. Other systems should buffer the output. The problem is: if
> > a Linux cookie stream fails, will it already have written something to
> > the output? A Windows proxy would push the value to a temporary buffer.
>
> This is pretty low-level stuff, I think we should focus on the protocol
> changes and user-visible libpq API first.

+1. In particular, I'd like to see the libpq API changes.

merlin
