Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-09-10 Thread Ryan

[EMAIL PROTECTED] (Joe Conway) wrote in message 
news:02a101c12c0e$37856d60$48d210ac@jecw2k1...
... snip
 BYTEA is very similar to TEXT, except that it is intended for binary data. I
 recently posted a PHP function which escapes binary in order to allow
 inserting it into a bytea column (on the pgsql-general list). At a minimum
 there are 4 characters which must be escaped. They are ACSII 0, 10, 39, and
 92. They must reach PostgreSQL looking like \\000, \\012, \\047, and \\134
 respectively (actually 39 could be \' and 92 could be , but it is
 simpler to be consistent).
... snip

Is it actually necessary to escape \012 (linefeed) in a query?  My
brief testing using psql or python pygresql seemed to work ok with
only \000, \', and \\ escaped.  Gosh, maybe all my data is corrupted
(!!)

Ryan

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-09-10 Thread Joe Conway

 [EMAIL PROTECTED] (Joe Conway) wrote in message
news:02a101c12c0e$37856d60$48d210ac@jecw2k1...
 ... snip
  BYTEA is very similar to TEXT, except that it is intended for binary
data. I
  recently posted a PHP function which escapes binary in order to allow
  inserting it into a bytea column (on the pgsql-general list). At a
minimum
  there are 4 characters which must be escaped. They are ACSII 0, 10, 39,
and
  92. They must reach PostgreSQL looking like \\000, \\012, \\047, and
\\134
  respectively (actually 39 could be \' and 92 could be , but it is
  simpler to be consistent).
 ... snip

 Is it actually necessary to escape \012 (linefeed) in a query?  My
 brief testing using psql or python pygresql seemed to work ok with
 only \000, \', and \\ escaped.  Gosh, maybe all my data is corrupted
 (!!)


Sorry the response is a week late, but your post just hit the list (at least
I just got it). I found after sending this, that the problem with linefeeds
was in my PHP code, so you should be OK :-)

Sorry for the confusion I may have caused!

Joe



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-09-10 Thread Bruce Momjian

 [EMAIL PROTECTED] (Joe Conway) wrote in message 
news:02a101c12c0e$37856d60$48d210ac@jecw2k1...
 ... snip
  BYTEA is very similar to TEXT, except that it is intended for binary data. I
  recently posted a PHP function which escapes binary in order to allow
  inserting it into a bytea column (on the pgsql-general list). At a minimum
  there are 4 characters which must be escaped. They are ACSII 0, 10, 39, and
  92. They must reach PostgreSQL looking like \\000, \\012, \\047, and \\134
  respectively (actually 39 could be \' and 92 could be , but it is
  simpler to be consistent).
 ... snip
 
 Is it actually necessary to escape \012 (linefeed) in a query?  My
 brief testing using psql or python pygresql seemed to work ok with
 only \000, \', and \\ escaped.  Gosh, maybe all my data is corrupted
 (!!)

The linefeed escape was reported by a PHP users and perhaps there is an
issue with PHP only.  Not sure.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-08-24 Thread Peter T Mount

Quoting Joe Conway [EMAIL PROTECTED]:

 TEXT is a datatype which stores character data of unspecified length (up
 to
 the max value of a 4 byte integer in length, although I've seen
 comments
 indicating that the practical limit is closer to 1 GB -- not sure why).

It may be something to do with the 1Gb splitting of the physical files 
representing a table... Unless it changed recently, a table was split over 
multiple files at the 1Gb mark.

Peter

-- 
Peter Mount [EMAIL PROTECTED]
PostgreSQL JDBC Driver: http://www.retep.org.uk/postgres/
RetepPDF PDF library for Java: http://www.retep.org.uk/pdf/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-08-24 Thread Tom Lane

Peter T Mount [EMAIL PROTECTED] writes:
 Quoting Joe Conway [EMAIL PROTECTED]:
 indicating that the practical limit is closer to 1 GB -- not sure why).

 It may be something to do with the 1Gb splitting of the physical files 
 representing a table...

No, that's just a coincidence.  The reason that TOAST limits fields to
1Gb is that the high-order two bits of the varlena length word were
commandeered as TOAST state indicators.  There are now only thirty bits
available to represent the length of a variable-length Datum; hence the
hard limit on field width is 1Gb.

I'd think that the practical limit is quite a bit less than that, at
least until we devise an API that lets you read and write toasted values
in sections.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-08-24 Thread Bruce Momjian

 No, that's just a coincidence.  The reason that TOAST limits fields to
 1Gb is that the high-order two bits of the varlena length word were
 commandeered as TOAST state indicators.  There are now only thirty bits
 available to represent the length of a variable-length Datum; hence the
 hard limit on field width is 1Gb.
 
 I'd think that the practical limit is quite a bit less than that, at
 least until we devise an API that lets you read and write toasted values
 in sections.

Yes, passing around multi-gigabytes memory chunks in a process is pretty
slow.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-08-24 Thread Jan Wieck

Peter T Mount wrote:
 Quoting Joe Conway [EMAIL PROTECTED]:

  TEXT is a datatype which stores character data of unspecified length (up
  to
  the max value of a 4 byte integer in length, although I've seen
  comments
  indicating that the practical limit is closer to 1 GB -- not sure why).

 It may be something to do with the 1Gb splitting of the physical files
 representing a table... Unless it changed recently, a table was split over
 multiple files at the 1Gb mark.

No,  it's  because  the  upper  two bits of the variable size
field are used as flags.

But in practice there are other limits that force you to keep
the objects you throw into text or bytea fields alot smaller.
When your INSERT query is received,  parsed,  planned  and  a
heap  tuple  created,  there are at least four copies of that
object in the backends memory. How much virtual  memory  does
your OS support for one single process?

And  by  the  way,  TOAST is not only used for character data
types.  All variable size data types in the base  system  are
toastable. Well, arrays might be considered sort of pop-tarts
here, but anyway, they get toasted.


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



[HACKERS] Toast, Text, blob bytea Huh?

2001-08-23 Thread jason . ory

I'm trying my best to convert from MySQL to PgSQL but I cant get a good
clear answer about 
certian issures.Mainly  TEXT, TOAST,BLOB , BYTEA etc.
It was an easy task in mysql but everything in the archives about , text ,
toast and bytea is just
confusing me with postgresql. I have Bruces's book and I've searched the
archives years back with all the right keywords with not luck.Here is my
situation--


WHAT I WAS DOING IN MYSQL
Via the web my clients are uploading basic  text/data files, sometimes 
than 30MB. In the past ,via CGI I have been parsing the file
into one STL string, using mysql_escape_string to escape it and then using
an INSERT  to place the 
   ,\'+stlstring+\' ,into a BLOB column. 
dont want to use a temp. file or files in general anywhere. The data will
always be passed via the database and buffers for certian reasons.Thus no
OID's


THIS IS WHAT I CANT SEEM TO FIGURE OUT IN POSTGRESQL
1. I cant get a clear answer on what kind of data type to use for my large
text string?  TEXT, ???, ??? or something about TOAST
I have seen in the e-mail archive but cant find any documentaion?

2. I've written my own escape method ,(cant find one for Pgsql) , BUT i
don't know what 
to escape and not to escape. So it keeps failing. I cand find any docs. on
what to escape either?


SUMMARY
What is the best datatype to use, for large raw text and/or  binary if i
choose? 
Once I know this,how Im a supposed to escape my string and get it through
the parser correctly so i can retrieve it correctly?

Thanks for your time.

PS: Using RedHat.



Jason H. Ory
Medprint+
Software Developer
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Toast, Text, blob bytea Huh?

2001-08-23 Thread Joe Conway

 I'm trying my best to convert from MySQL to PgSQL but I cant get a good
 clear answer about
 certian issures.Mainly  TEXT, TOAST,BLOB , BYTEA etc.
 It was an easy task in mysql but everything in the archives about , text ,
 toast and bytea is just
 confusing me with postgresql. I have Bruces's book and I've searched the

Quick glossary:

TEXT is a datatype which stores character data of unspecified length (up to
the max value of a 4 byte integer in length, although I've seen comments
indicating that the practical limit is closer to 1 GB -- not sure why). TEXT
is not intended to hold arbitrary binary data. If you want to store binary
in a text column, encode it to hex or base64 or something first.

TOAST is an internal database concept. If I understand it correctly, it
refers to a combination of compression and out-of-line storage for large
length values of a charater datatype. This happens transparently to you.

BLOB is otherwise known as LO or Large Object datatype in PostgreSQL. These
are always stored out-of-line, I don't believe they are compressed, and they
have their own special access methods (for dealing with data a chunk at a
time).

BYTEA is very similar to TEXT, except that it is intended for binary data. I
recently posted a PHP function which escapes binary in order to allow
inserting it into a bytea column (on the pgsql-general list). At a minimum
there are 4 characters which must be escaped. They are ACSII 0, 10, 39, and
92. They must reach PostgreSQL looking like \\000, \\012, \\047, and \\134
respectively (actually 39 could be \' and 92 could be , but it is
simpler to be consistent).

 THIS IS WHAT I CANT SEEM TO FIGURE OUT IN POSTGRESQL
 1. I cant get a clear answer on what kind of data type to use for my large
 text string?  TEXT, ???, ??? or something about TOAST
 I have seen in the e-mail archive but cant find any documentaion?

So, you can use TEXT if you encode to hex or base64 in your app first, or
you bytea if you escape as I described above in your app. Or you can use the
LO functions to manipulate large objects (see
http://www.postgresql.org/idocs/index.php?lo-interfaces.html).


 2. I've written my own escape method ,(cant find one for Pgsql) , BUT i
 don't know what
 to escape and not to escape. So it keeps failing. I cand find any docs. on
 what to escape either?

See above.



 SUMMARY
 What is the best datatype to use, for large raw text and/or  binary if i
 choose?
 Once I know this,how Im a supposed to escape my string and get it through
 the parser correctly so i can retrieve it correctly?

If you use TEXT, you will have to decode the hex/base64 back into binary in
your app. Similarly, if you use bytea, although stored as binary, the data
is returned with unprintable values escaped as octals*. Your app will have
to decode the octals back into binary.

*NOTE to hackers: is there a good reason for this? ISTM that the client
should be responsible for any encoding needed when bytea is returned. The
server should return bytea as straight varlena.

If you use LO, you have to use the interface functions instead of standard
SQL.

Hope this helps,

-- Joe


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly