Re: Are there use cases for storing null bytes in CharField/TextField?

2017-06-02 Thread Tim Graham
I found a PostgreSQL bug report requesting removal of the restriction. 
Here's the final reply:

Franklin Schmidt wrote:
> I agree that storing 0x00 in a UTF8 string is weird, but I am
> converting a huge database to postgres, and in a huge database, weird
> things happen.  Using bytea for a text field just because one in a
> million records has a 0x00 doesn't make sense to me.  I did hack
> around it in my conversion code to remove the 0x00 but I expect that
> anyone else who tries converting a big database to postgres will also
> confront this issue.

That's the right solution. If you have 0x00 bytes in your text fields, 
you're much better off cleaning them away anyway, than trying to work 
around them.

-Heikki Linnakangas


https://www.postgresql.org/message-id/200712170734.lBH7YdG9034458%40wwwmaster.postgresql.org

I also found a possibly related discussion about supporting \u in JSON 
values [1]. PostgreSQL tried to support it but had to remove that support 
because it caused ambiguity.

https://www.postgresql.org/message-id/e1yhhv8-00032a...@gemulon.postgresql.org

On Wednesday, May 31, 2017 at 8:27:49 PM UTC-4, Jon Dufresne wrote:
>
> On Mon, May 15, 2017 at 10:30 AM, Tim Chase wrote:
>
>> On 2017-05-15 08:54, Tim Graham wrote:
>> > Does anyone know of a use case for using null bytes in
>> > CharField/TextField?
>>
>> Is this not what BinaryField is for?  It would seem to me that
>> attempting to store binary NULL bytes in a CharField/TextField should
>> result in an error condition.
>>
>
> The null byte is also a valid Unicode code point [0].
>
> I guess I'm a bit surprised that a valid code point can't be stored in a 
> PostgreSQL text column. This does appear to be documented for the char(int) 
> string function [1], although without justification.
>
> > The NULL (0) character is not allowed because text data types cannot 
> > store such bytes.
>
> I'm curious about the reasoning behind PostgreSQL's decision to prohibit 
> this code point. If anyone has additional information to share on their 
> reason, please pass it along.
>
>
> [0] http://www.fileformat.info/info/unicode/char//index.htm
> [1] https://www.postgresql.org/docs/current/static/functions-string.html
>
> Cheers
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/450da309-c2af-43a9-8026-51d23994f8c9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-31 Thread Jon Dufresne
On Mon, May 15, 2017 at 10:30 AM, Tim Chase wrote:

> On 2017-05-15 08:54, Tim Graham wrote:
> > Does anyone know of a use case for using null bytes in
> > CharField/TextField?
>
> Is this not what BinaryField is for?  It would seem to me that
> attempting to store binary NULL bytes in a CharField/TextField should
> result in an error condition.
>

The null byte is also a valid Unicode code point [0].

I guess I'm a bit surprised that a valid code point can't be stored in a
PostgreSQL text column. This does appear to be documented for the char(int)
string function [1], although without justification.

> The NULL (0) character is not allowed because text data types cannot
> store such bytes.

I'm curious about the reasoning behind PostgreSQL's decision to prohibit
this code point. If anyone has additional information to share on their
reason, please pass it along.
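
The code point claim can be checked from Python. This is a minimal sketch using only the standard library; the variable names are illustrative and nothing here touches a database:

```python
import unicodedata

nul = "\x00"  # U+0000, the code point under discussion

# U+0000 is an ordinary member of a Python str and counts toward its length.
text = "abc" + nul + "def"
length = len(text)

# It falls in the "Cc" (control) general category, like the other C0 controls.
category = unicodedata.category(nul)

# UTF-8 encodes it as the single byte 0x00 and decodes it back unchanged.
encoded = text.encode("utf-8")
roundtrip = encoded.decode("utf-8")
```

So the restriction comes from the storage side, not from Unicode or UTF-8.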


[0] http://www.fileformat.info/info/unicode/char//index.htm
[1] https://www.postgresql.org/docs/current/static/functions-string.html

Cheers



Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-19 Thread Tim Graham
If CharField/TextField raise a form validation error when null bytes are in 
the input, are users going to be able to understand that error and fix it? 
I'm not sure if it's a probable case, but I'm thinking of a non-technical 
user who copies and pastes some text that includes a null byte.

Perhaps a "strip_null_bytes" model field option that defaults to True 
would be reasonable. That could be passed to the form field to toggle whether 
or not that validation happens. Actually, three possible behaviors might be 
needed: silently strip null bytes, allow null bytes (an invalid option when 
using PostgreSQL), or prohibit null bytes.
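
The three behaviors listed above could be sketched roughly as follows. This is plain Python, not Django code; NullBytePolicy and apply_null_byte_policy are hypothetical names chosen for illustration, and the error message is an assumption:

```python
from enum import Enum


class NullBytePolicy(Enum):
    """Hypothetical names for the three behaviors."""
    STRIP = "strip"
    ALLOW = "allow"
    PROHIBIT = "prohibit"


def apply_null_byte_policy(value, policy=NullBytePolicy.STRIP):
    """Return value after applying policy; raise ValueError under PROHIBIT."""
    if "\x00" not in value:
        return value
    if policy is NullBytePolicy.STRIP:
        # Silently drop every null byte.
        return value.replace("\x00", "")
    if policy is NullBytePolicy.PROHIBIT:
        # Surface the problem instead of changing the data.
        raise ValueError("Null characters are not allowed.")
    # ALLOW: pass the value through untouched (PostgreSQL would reject it).
    return value
```

A single boolean option can only express two of the three cases, which is why an enumeration (or a similar three-valued setting) is used in this sketch.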

On Tuesday, May 16, 2017 at 5:11:38 AM UTC-4, Jani Tiainen wrote:
>
> Hi,
>
> I would guess that one could use a null byte to denote an "empty field" in 
> Oracle, for example. (I recall seeing such a convention in one of our 
> non-Django apps.) That's to overcome the limitation that Oracle doesn't 
> have a real concept of an empty string, so we stored a single null byte to 
> mark that. 
>
>
> On 15.05.2017 18:54, Tim Graham wrote:
>
> Does anyone know of a use case for using null bytes in CharField/TextField?
>
> psycopg2 2.7+ raises ValueError("A string literal cannot contain NUL 
> (0x00) characters.") when trying to save null bytes [0], and this 
> exception is unhandled in Django, which allows malicious form submissions 
> to trigger a crash [1]. With psycopg2 < 2.7, there is no exception and null 
> bytes are silently truncated by PostgreSQL. Other databases that I tested 
> (SQLite, MySQL, Oracle) allow saving null bytes. This creates possible 
> cross-database compatibility problems when moving data from those databases 
> to PostgreSQL, e.g. [2].
>
> I propose to have CharField and TextField strip null bytes from the value 
> either a) only on PostgreSQL or b) on all databases. Please indicate your 
> preference or suggest another solution.
>
> [0] https://github.com/psycopg/psycopg2/issues/420
> [1] https://code.djangoproject.com/ticket/28201 - Saving a Char/TextField 
> with psycopg2 2.7+ raises ValueError: A string literal cannot contain NUL 
> (0x00) characters is unhandled
> [2] https://code.djangoproject.com/ticket/28117 - loaddata raises 
> ValueError with psycopg2 backend when data contains null bytes
> -- 
> Jani Tiainen
>
>



Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-16 Thread Jani Tiainen

Hi,

I would guess that one could use a null byte to denote an "empty field" in 
Oracle, for example. (I recall seeing such a convention in one of our 
non-Django apps.) That's to overcome the limitation that Oracle doesn't 
have a real concept of an empty string, so we stored a single null byte to 
mark that.
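
A guess at what such a convention could look like, as a minimal sketch. EMPTY_SENTINEL, to_db and from_db are hypothetical names; this is not the code from the app mentioned above:

```python
EMPTY_SENTINEL = "\x00"  # assumed marker stored in place of an empty string


def to_db(value):
    """Map a Python value to what gets stored: '' becomes the sentinel."""
    if value is None:
        return None  # a genuine NULL stays NULL
    return EMPTY_SENTINEL if value == "" else value


def from_db(stored):
    """Reverse the mapping when reading a column value back."""
    if stored is None:
        return None
    return "" if stored == EMPTY_SENTINEL else stored
```

With this mapping, None and "" remain distinguishable after a round trip even though the database itself treats an empty string as NULL.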




On 15.05.2017 18:54, Tim Graham wrote:
> Does anyone know of a use case for using null bytes in 
> CharField/TextField?
>
> psycopg2 2.7+ raises ValueError("A string literal cannot contain NUL 
> (0x00) characters.") when trying to save null bytes [0], and this 
> exception is unhandled in Django, which allows malicious form 
> submissions to trigger a crash [1]. With psycopg2 < 2.7, there is no 
> exception and null bytes are silently truncated by PostgreSQL. Other 
> databases that I tested (SQLite, MySQL, Oracle) allow saving null bytes. 
> This creates possible cross-database compatibility problems when moving 
> data from those databases to PostgreSQL, e.g. [2].
>
> I propose to have CharField and TextField strip null bytes from the 
> value either a) only on PostgreSQL or b) on all databases. Please 
> indicate your preference or suggest another solution.
>
> [0] https://github.com/psycopg/psycopg2/issues/420
> [1] https://code.djangoproject.com/ticket/28201 - Saving a 
> Char/TextField with psycopg2 2.7+ raises ValueError: A string literal 
> cannot contain NUL (0x00) characters is unhandled
> [2] https://code.djangoproject.com/ticket/28117 - loaddata raises 
> ValueError with psycopg2 backend when data contains null bytes

--
Jani Tiainen



Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-15 Thread Claude Paroz
I also think that this should be handled at serialization level (form 
fields and (de)serialization framework).

Claude



Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-15 Thread Tim Chase
On 2017-05-15 08:54, Tim Graham wrote:
> Does anyone know of a use case for using null bytes in
> CharField/TextField?

Is this not what BinaryField is for?  It would seem to me that
attempting to store binary NULL bytes in a CharField/TextField should
result in an error condition.

-tkc





Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-15 Thread Luke Plant
I agree with Adam, we should never silently change submitted data at the 
model layer. My preference would be c), a form-level validation error 
that prevents saving.


Luke


On 15/05/17 19:11, Adam Johnson wrote:
> The problem with (a) - data with null bytes in strings from other 
> databases can't be loaded into PG as per #28117.
>
> The problem with (b) - data currently in databases in the wild will be 
> modified upon save.
>
> (b) is incredibly destructive and could break an unknown number of 
> applications whilst (a) doesn't affect anyone until they try to 
> migrate null-byte-strings into PG. I vote for (a), or (c) add 
> form-level validation to (Char/Text)Field that null bytes aren't in 
> the submitted string (for all databases) and error when trying to save 
> them on PG.
>
> On 15 May 2017 at 16:54, Tim Graham wrote:
>
> > Does anyone know of a use case for using null bytes in
> > CharField/TextField?
> >
> > psycopg2 2.7+ raises ValueError("A string literal cannot contain
> > NUL (0x00) characters.") when trying to save null bytes [0], and
> > this exception is unhandled in Django, which allows malicious form
> > submissions to trigger a crash [1]. With psycopg2 < 2.7, there is no
> > exception and null bytes are silently truncated by PostgreSQL.
> > Other databases that I tested (SQLite, MySQL, Oracle) allow saving
> > null bytes. This creates possible cross-database compatibility
> > problems when moving data from those databases to PostgreSQL, e.g. [2].
> >
> > I propose to have CharField and TextField strip null bytes from
> > the value either a) only on PostgreSQL or b) on all databases.
> > Please indicate your preference or suggest another solution.
> >
> > [0] https://github.com/psycopg/psycopg2/issues/420
> > [1] https://code.djangoproject.com/ticket/28201 - Saving a
> > Char/TextField with psycopg2 2.7+ raises ValueError: A string
> > literal cannot contain NUL (0x00) characters is unhandled
> > [2] https://code.djangoproject.com/ticket/28117 - loaddata raises
> > ValueError with psycopg2 backend when data contains null bytes
--
Adam


Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-15 Thread Michael Manfre
I imagine we won't hear of a use case until after the change happens, and
I'm somewhat strongly opposed to stripping potentially valid data from all
databases because of a limitation of one. I'd be in favor of loaddata
checking for null bytes and complaining when the backend doesn't support
that feature.
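
A rough sketch of what such a check might look like, written as plain Python over already-deserialized fixture records. find_null_bytes, check_fixture and the backend_supports_null_bytes flag are hypothetical names, not existing loaddata or backend features; the record shape follows Django's JSON fixture layout ("model", "pk", "fields"):

```python
def find_null_bytes(objects):
    """Yield (model, pk, field_name) for each string field holding a null byte."""
    for obj in objects:
        for name, value in obj.get("fields", {}).items():
            if isinstance(value, str) and "\x00" in value:
                yield obj.get("model"), obj.get("pk"), name


def check_fixture(objects, backend_supports_null_bytes):
    """Raise ValueError if the target backend cannot store the fixture data."""
    if backend_supports_null_bytes:
        return  # nothing to complain about on this backend
    offenders = list(find_null_bytes(objects))
    if offenders:
        details = ", ".join("%s(pk=%r).%s" % o for o in offenders)
        raise ValueError(
            "Fixture contains null bytes the backend cannot store: " + details
        )
```

The data is left untouched on every backend; the only change is that an unsupported load fails early with a message naming the affected records.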

Regards,
Michael Manfre

On Mon, May 15, 2017 at 11:54 AM Tim Graham wrote:

> Does anyone know of a use case for using null bytes in CharField/TextField?
>
> psycopg2 2.7+ raises ValueError("A string literal cannot contain NUL
> (0x00) characters.") when trying to save null bytes [0], and this
> exception is unhandled in Django, which allows malicious form submissions
> to trigger a crash [1]. With psycopg2 < 2.7, there is no exception and null
> bytes are silently truncated by PostgreSQL. Other databases that I tested
> (SQLite, MySQL, Oracle) allow saving null bytes. This creates possible
> cross-database compatibility problems when moving data from those databases
> to PostgreSQL, e.g. [2].
>
> I propose to have CharField and TextField strip null bytes from the value
> either a) only on PostgreSQL or b) on all databases. Please indicate your
> preference or suggest another solution.
>
> [0] https://github.com/psycopg/psycopg2/issues/420
> [1] https://code.djangoproject.com/ticket/28201 - Saving a Char/TextField
> with psycopg2 2.7+ raises ValueError: A string literal cannot contain NUL
> (0x00) characters is unhandled
> [2] https://code.djangoproject.com/ticket/28117 - loaddata raises
> ValueError with psycopg2 backend when data contains null bytes



Re: Are there use cases for storing null bytes in CharField/TextField?

2017-05-15 Thread Adam Johnson
The problem with (a) - data with null bytes in strings from other databases
can't be loaded into PG as per #28117.

The problem with (b) - data currently in databases in the wild will be
modified upon save.

(b) is incredibly destructive and could break an unknown number of
applications whilst (a) doesn't affect anyone until they try to migrate
null-byte-strings into PG. I vote for (a), or (c) add form-level validation
to (Char/Text)Field that null bytes aren't in the submitted string (for all
databases) and error when trying to save them on PG.
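
Option (c) could be sketched like this: a minimal, framework-free illustration in which validate_form_data and NULL_BYTE_MESSAGE are hypothetical names and the message text is an assumption. The point is that a bad submission turns into a per-field error instead of an unhandled exception further down:

```python
NULL_BYTE_MESSAGE = "Null characters are not allowed."


def validate_form_data(data):
    """Return {field_name: error_message} for string values with a null byte."""
    errors = {}
    for name, value in data.items():
        if isinstance(value, str) and "\x00" in value:
            errors[name] = NULL_BYTE_MESSAGE
    return errors


submitted = {"title": "hello", "body": "bad\x00input"}
errors = validate_form_data(submitted)
if errors:
    # Report the problem to the user; nothing reaches the database layer.
    result = ("invalid", errors)
else:
    result = ("valid", submitted)
```

Nothing is modified here: the submitted value is either accepted as-is or rejected with an explanation.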


On 15 May 2017 at 16:54, Tim Graham wrote:

> Does anyone know of a use case for using null bytes in CharField/TextField?
>
> psycopg2 2.7+ raises ValueError("A string literal cannot contain NUL
> (0x00) characters.") when trying to save null bytes [0], and this
> exception is unhandled in Django, which allows malicious form submissions
> to trigger a crash [1]. With psycopg2 < 2.7, there is no exception and null
> bytes are silently truncated by PostgreSQL. Other databases that I tested
> (SQLite, MySQL, Oracle) allow saving null bytes. This creates possible
> cross-database compatibility problems when moving data from those databases
> to PostgreSQL, e.g. [2].
>
> I propose to have CharField and TextField strip null bytes from the value
> either a) only on PostgreSQL or b) on all databases. Please indicate your
> preference or suggest another solution.
>
> [0] https://github.com/psycopg/psycopg2/issues/420
> [1] https://code.djangoproject.com/ticket/28201 - Saving a Char/TextField
> with psycopg2 2.7+ raises ValueError: A string literal cannot contain NUL
> (0x00) characters is unhandled
> [2] https://code.djangoproject.com/ticket/28117 - loaddata raises
> ValueError with psycopg2 backend when data contains null bytes



-- 
Adam



Are there use cases for storing null bytes in CharField/TextField?

2017-05-15 Thread Tim Graham
Does anyone know of a use case for using null bytes in CharField/TextField?

psycopg2 2.7+ raises ValueError("A string literal cannot contain NUL (0x00) 
characters.") when trying to save null bytes [0], and this exception is 
unhandled in Django, which allows malicious form submissions to trigger a 
crash [1]. With psycopg2 < 2.7, there is no exception and null bytes are 
silently truncated by PostgreSQL. Other databases that I tested (SQLite, 
MySQL, Oracle) allow saving null bytes. This creates possible cross-database 
compatibility problems when moving data from those databases to PostgreSQL, 
e.g. [2].

I propose to have CharField and TextField strip null bytes from the value 
either a) only on PostgreSQL or b) on all databases. Please indicate your 
preference or suggest another solution.

[0] https://github.com/psycopg/psycopg2/issues/420
[1] https://code.djangoproject.com/ticket/28201 - Saving a Char/TextField 
with psycopg2 2.7+ raises ValueError: A string literal cannot contain NUL 
(0x00) characters is unhandled
[2] https://code.djangoproject.com/ticket/28117 - loaddata raises 
ValueError with psycopg2 backend when data contains null bytes
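
Options (a) and (b) above differ only in whether the database backend is consulted before stripping. A minimal sketch in plain Python, where strip_null_bytes, its postgres_only flag, and the vendor strings are assumptions made for illustration rather than an existing Django API:

```python
def strip_null_bytes(value, vendor, postgres_only=True):
    """Strip null bytes from a string value before it is saved.

    vendor is an assumed backend name such as "postgresql" or "sqlite".
    postgres_only=True models option (a); False models option (b).
    """
    if not isinstance(value, str):
        return value  # leave non-string values alone
    if postgres_only and vendor != "postgresql":
        return value  # option (a): other backends keep the data as-is
    return value.replace("\x00", "")
```

Under option (a) the same input is stored differently depending on the backend; under option (b) it is stored identically everywhere, at the cost of changing data on backends that could have kept it.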
