Transaction management and atomic

2014-09-30 Thread Cristiano Coelho
Hello there,

I recently upgraded from Django 1.5 to 1.7, and one of the main changes I had 
to make was replacing every use of the deprecated transaction.commit_manually 
with transaction.atomic. So far so good.

After this, I found out that if an IntegrityError or DatabaseError is raised 
inside code decorated with @transaction.atomic (or inside a with 
transaction.atomic() block) and the exception is handled there (not propagated 
out of the atomic block), the whole transaction becomes invalid and any further 
database access fails, just as described in the docs (silly me, I didn't read 
them closely enough).

As mentioned in the docs, the solution is to wrap the exception-raising code 
in another atomic block and catch the exception outside of it. I cannot 
describe how annoying this is compared to the old behaviour, where I could 
easily decide when to commit or roll back. Now I have to review my whole 
codebase to find every place where a database save is performed and the 
exception is handled, and add another atomic block around it.

I believe this was heavily discussed by the developers and they settled on 
this as the best option, but there needs to be an easier way to handle this 
kind of issue.
What are the complications of leaving the transaction in a usable state even 
if an operation raises a database error and the exception is handled silently 
(not propagated out of the atomic block)? This was entirely possible with the 
deprecated transaction functions, where you could run all your logic and only 
at the very end commit or roll back, no matter what happened inside. Now that 
is impossible: you need to keep a sharp eye on every database save you perform 
and surround it with another atomic block in case it raises a database error.


To make the issue clear, here's a sample code:

The first one shows what my current code looks like, where add_children() 
raises an exception because generate_relationships() was not inside another 
atomic block.
With the old transactions API I could simply wrap the whole thing in a 
try/except and commit or roll back at the very end; everything would be fine 
even if generate_relationships() threw an exception, since it would be 
silently ignored.

from django.db import IntegrityError, transaction

@transaction.atomic
def viewfunc(request):
    create_parent()

    try:
        generate_relationships()
    except IntegrityError:
        handle_exception()

    add_children()



This is how the code actually has to look under Django 1.7 in order to avoid 
the error and get the expected behaviour:

from django.db import IntegrityError, transaction

@transaction.atomic
def viewfunc(request):
    create_parent()

    try:
        with transaction.atomic():
            generate_relationships()
    except IntegrityError:
        handle_exception()

    add_children()


It would be great if the transaction API could be written like the first 
snippet but behave like the second one: even if generate_relationships() 
raises an exception that is handled, the transaction would still be valid and 
usable.
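For what it's worth, the SQL-level mechanism that a nested atomic block maps to is a savepoint, and its semantics can be sketched with the stdlib sqlite3 module (a hedged illustration, not Django code): rolling back to a savepoint undoes only the inner work and leaves the outer transaction usable, which is exactly what the extra atomic block buys.

```python
import sqlite3

# Sketch: nested transaction.atomic() blocks correspond to SQL SAVEPOINTs.
# Rolling back to a savepoint after an IntegrityError leaves the outer
# transaction intact and usable.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY)")

conn.execute("BEGIN")                             # outer "atomic" block
conn.execute("INSERT INTO child VALUES (1)")

conn.execute("SAVEPOINT sp1")                     # inner atomic() block
try:
    conn.execute("INSERT INTO child VALUES (1)")  # duplicate pk -> IntegrityError
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK TO SAVEPOINT sp1")     # undo only the inner work
    conn.execute("RELEASE SAVEPOINT sp1")

conn.execute("INSERT INTO child VALUES (2)")      # outer transaction still valid
conn.execute("COMMIT")

rows = [r[0] for r in conn.execute("SELECT id FROM child ORDER BY id")]
print(rows)  # [1, 2]
```

Django's transaction.atomic() does this bookkeeping for you; the low-level savepoint functions (transaction.savepoint(), transaction.savepoint_rollback(), transaction.savepoint_commit()) are also still exposed for code that wants manual control.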

Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/e347fe6d-78c8-4c15-848d-3a82415c3550%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[Feature Request] Performant values_list query

2015-11-16 Thread Cristiano Coelho
I would like to add that a minor change to one critical query raised 
throughput from 37 to 47 requests per second in an Apache benchmark test; 
that's roughly a 27% improvement, and it will be bigger on larger result sets 
(this was with only a 700-row result set). Compared to pagination with a limit 
of 100, which raises throughput to almost 200 requests per second, there is a 
clear bottleneck here.

Looking at the code of the values_list queryset, maybe it is possible to 
replace the call to results_iter with a direct call to execute_sql(MULTI), 
avoiding all the ifs and loops in results_iter that are mainly useful for 
other queryset types but not for values_list. I tested this, and values_list 
queries performed as well as a plain Django cursor; not quite as fast as 
calling as_sql() and then using the cursor directly, but very close.

What would be the issues with adding such an optimization to values_list? I'm 
even considering monkey-patching it in my project so that all my values_list 
queries are improved, since I use values_list most of the time for performance 
reasons: building model objects for large result sets is extremely slow, blame 
Python sadly.



Re: [Feature Request] Performant values_list query

2015-11-16 Thread Cristiano Coelho
Hi, thanks for the response! I have never developed against nor run the Django 
test suite; I can certainly try as you suggested. I was hoping someone who 
actually implemented values_list could give me a solid reason not to make any 
change, as I'm probably wrong and the current implementation is already the 
fastest approach.
I guess I can play with the tests for a while. However, I believe the tests 
need to be run against all database backends? Installing all the databases 
will be a little complicated.



Re: [Feature Request] Performant values_list query

2015-11-16 Thread Cristiano Coelho
Interesting; maybe it's because I'm using MySQL with the mysqlclient 
connector, but running the plain query through the Django cursor wrapper 
always returns the correct data types, even datetimes with their time zone 
when time zone support is enabled. Was that all a coincidence? Would a 
different backend break, with a cursor query returning invalid or unexpected 
data? That seems like a lot to test.



Re: [Feature Request] Performant values_list query

2015-11-16 Thread Cristiano Coelho
Hi, I have downloaded the actual source code; I probably forgot to mention 
that I'm using Django 1.7.10.
It seems the compiler.py module has improved a bit since then: what used to be 
a big, highly inefficient loop with many if conditions inside was reduced to a 
small loop with a single if. (It could be improved even more by moving the if 
out of the loop and writing two separate loops, one per branch, to avoid 
re-checking a condition whose value is already known; that is slow in Python 
since there's no JIT optimizing this kind of loop.)
Also, ValuesListQuerySet was apparently changed to ValuesListIterable, but the 
functionality remains the same.
So I should probably redo the tests with this version to see whether the call 
to results_iter is still a big deal for a values_list query.

I really wish values_list (and values?) were treated as high-performance query 
options with optimizations in mind, rather than just the implicit improvement 
of returning a list of tuples instead of model instances, which are very 
expensive for large queries.


On Sunday, 15 November 2015 at 23:10:08 (UTC-3), Cristiano Coelho wrote:
>
> After some testing I have found that even when using values_list to 
> prevent massive object creation when fetching data, it is 2 to 3 times 
> slower than using a cursor.execute query directly through the Django cursor.
> The issue started here 
> http://stackoverflow.com/questions/33726004/django-queryset-vs-raw-query-performance
> when trying to improve some queries that were looking very slow under Apache 
> benchmark testing.
> It seems that compiler.results_iter, called from ValuesListQuerySet, is 
> very, very slow due to all the for loops in Python code, compared to 
> using a raw query through a C connector (like mysqlclient); there's just 
> too much boilerplate that might be possible to remove.
> As an example, a very ugly workaround for those critical queries was 
> defining a function like this, which converts a queryset into 
> something usable by a cursor and performs very efficiently, at 
> least with the mysqlclient connector:
>
> q = qs.query.get_compiler(qs.db).as_sql()
> with connection.cursor() as c:
>     c.execute(q[0], q[1])
>     for r in c:
>         yield r
>
> Would it be possible to have something similar to values_list that 
> executes directly through a cursor, improving performance? I'm sure it 
> would be less flexible than values_list, but the extra performance would 
> be nice.



Re: [Feature Request] Performant values_list query

2015-11-16 Thread Cristiano Coelho
You beat me to it. Yes, I have just tested under Django 1.8.6 and the issue I 
started this thread with is gone: both values_list and a direct raw query 
perform equally well, so this is definitely an issue only on Django 1.7.10 or 
earlier.
I cannot test Django 1.9 since my project is not compatible with it; I have 
some backwards-compatibility issues from 1.7.10 related to app loading that I 
was too lazy to fix, so I stayed on 1.7. But it seems it's time to upgrade, 
assuming 1.9 is as good as 1.8.

Sorry about all the fuss; I guess I should have tested all this against 1.8 
before posting, but I appreciate the fast responses!
Perhaps it would be a good idea to warn about the values_list bottleneck on 1.7!


On Monday, 16 November 2015 at 21:30:35 (UTC-3), Josh Smeaton wrote:
>
> The version of Django you use is going to have a large (code) impact on 
> what is actually happening when calling values_list. The 
> Values[List]QuerySet classes are gone in 1.9. 1.8 implemented a 
> different/better way of converting values from the database to python. 
> from_db_value came about in 1.8 I think which should fast path a lot of 
> conversions. The stackoverflow post you linked to mentions Django 1.7. Can 
> you run exactly the same tests using Django Master/1.9 and report back your 
> findings? 
>
> I don't doubt there's room for performance improvements if you go looking. 
> As Anssi said, we'd definitely welcome improvements to performance where 
> they can be found. But you should make sure the kinds of changes you want 
> to make will have an impact when using the latest version of Django, 
> because some of the low hanging fruit may have already been patched.
>
> Cheers



[Question] Many-To-Many query where only pk is returned

2015-11-18 Thread Cristiano Coelho
Hello there,

Let's say I have these two models (sorry about the Spanish names!) (Django 
1.8.6 and MySQL backend):

class Especialidad(models.Model):
    nombre = models.CharField(max_length=250, blank=False, unique=True)


class Usuario(AbstractBaseUser):
    permisosEspecialidad = models.ManyToManyField("Especialidad", blank=True)

Let u be some Usuario instance, and consider the following query:

u.permisosEspecialidad.all().values_list('pk', flat=True)

The actual printed query is:

SELECT `samiBackend_especialidad`.`id`
FROM `samiBackend_especialidad` 
INNER JOIN `samiBackend_usuario_permisosEspecialidad` ON ( 
`samiBackend_especialidad`.`id` = 
`samiBackend_usuario_permisosEspecialidad`.`especialidad_id` ) 
WHERE `samiBackend_usuario_permisosEspecialidad`.`usuario_id` = 8

As I understand it, since I'm only selecting the id field, which is already 
present in the intermediary table (and is also a FK), the join is redundant: 
all the information I need is already there.

So the query could work like this

SELECT `samiBackend_usuario_permisosEspecialidad`.`especialidad_id`
FROM  `samiBackend_usuario_permisosEspecialidad`
WHERE `samiBackend_usuario_permisosEspecialidad`.`usuario_id` = 8


I guess it works this way because this particular case might be hard to 
detect, or might not compose with further query building. However, for 
ForeignKey relations this optimization is already done (if you select only the 
primary key of the related model, no join is added).

What would be the complications of implementing this? Would it be worth the effort?
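To make the claimed redundancy concrete, here is a plain-SQL sketch using stdlib sqlite3 (the table and column names are simplified stand-ins for Django's m2m layout, not the real samiBackend tables): when only the related primary key is selected, reading the intermediate table alone yields the same ids as the join.

```python
import sqlite3

# Build a minimal m2m layout: a target table and an intermediate table,
# mirroring what Django generates for a ManyToManyField.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE especialidad (id INTEGER PRIMARY KEY, nombre TEXT);
    CREATE TABLE usuario_permisos (
        usuario_id INTEGER,
        especialidad_id INTEGER REFERENCES especialidad(id)
    );
    INSERT INTO especialidad VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO usuario_permisos VALUES (8, 1), (8, 3), (9, 2);
""")

# What Django emits today: join the target table just to read its pk.
joined = [r[0] for r in conn.execute("""
    SELECT e.id FROM especialidad e
    JOIN usuario_permisos up ON e.id = up.especialidad_id
    WHERE up.usuario_id = 8 ORDER BY e.id
""")]

# The proposed shortcut: the FK column in the intermediate table is the
# same value, so the join can be skipped.
direct = [r[0] for r in conn.execute("""
    SELECT especialidad_id FROM usuario_permisos
    WHERE usuario_id = 8 ORDER BY especialidad_id
""")]

print(joined, direct)  # [1, 3] [1, 3]
```

As a workaround within Django today, querying the through model directly, e.g. `Usuario.permisosEspecialidad.through.objects.filter(usuario_id=u.pk).values_list('especialidad_id', flat=True)`, should already produce the single-table query.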




Re: [Question] Many-To-Many query where only pk is returned

2015-11-19 Thread Cristiano Coelho
I think you misread the query. You would filter the intermediate table 
directly by Usuario (since you want all Especialidad for one Usuario), 
yielding the 2 results it should, with a single select from one table. Right 
now Django translates it into a more complicated select that joins one of the 
tables with the intermediary one, which should only be needed when you are 
fetching data from the main table.

On Thursday, 19 November 2015 at 00:04:31 (UTC-3), Josh Smeaton wrote:
>
> It might be a bit early in the day for me, but isn't that query already 
> optimised? That is, it's already eliminated a join. It hasn't joined to the 
> "Especialidad" table, it's only joined to the intermediate table. I *think* 
> the join to the intermediate table is necessary because there could be 
> duplicates.
>
> Given the tables:
>
> Usuario(pk):
> 1
> 2
>
> Intermediate(usuario_id, especialidad_id):
> 1, 1
> 1, 2
>
> Especialidad(pk)
> 1
> 2
>
> Joining Usuario to Intermediate will return 4 results in SQL (2 for each 
> pk on Usuario) unless there was a distinct in there somewhere. I haven't 
> tested, so I'm not sure if django does duplicate elimination, but I'm 
> pretty sure it doesn't.
>
> Does this look right to you, or am I missing something?
>
> Cheers



Re: [Question] Many-To-Many query where only pk is returned

2015-11-19 Thread Cristiano Coelho
You are right. I believe an optimization like this would probably help only a 
few people, since fetching data solely from the intermediary table is rare. 
But if it were an easy change that improves performance, why not? However, I 
suspect the change is quite complicated.

On Thursday, 19 November 2015 at 11:51:51 (UTC-3), charettes wrote:
>
> Hi Cristiano,
>
> If I get it correctly you'd like m2m querying to start with the 
> intermediary (FROM) table and JOIN the referenced one only if more fields 
> than the primary key are selected.
>
> class Book(models.Model):
>     name = models.CharField(max_length=100)
>
> class Author(models.Model):
>     books = models.ManyToManyField(Book)
>
> author = Author.objects.get(pk=1)
> author.books.values_list('pk')
>
> Would result in the following query:
> SELECT book_id FROM author_books WHERE author_id = 1;
>
> Instead of:
> SELECT id FROM book JOIN author_books ON (book.id = author_books.book_id) 
> WHERE author_id = 1;
>
> I think this is a sensible optimization but I wonder about its 
> feasibility. It looks like the `pk` reference would require some special 
> handling to reference `book_id` since it's not actually a primary key on 
> the intermediate table.
>
> Simon



[Question] MySQL Microseconds stripping

2015-12-18 Thread Cristiano Coelho
Hello,

Since Django 1.8, the MySQL backend no longer strips microseconds.
This is giving me trouble upgrading from 1.7 (I actually upgraded straight to 
1.9), since datetimes are not stored with microsecond precision in MySQL, but 
queries are now sent with microseconds.
As I see it, my only option is to alter every existing datetime column in 
every table, which is quite tedious since there are many tables.
Is there a way to explicitly set a model's datetime precision? Would that also 
work for raw queries? Could this be a global setting or a monkey patch?
This new behaviour basically breaks any '=' query on datetimes, at least for 
raw queries (I haven't tested the others), since it sends microseconds that 
are not stripped.
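If one does decide to widen the columns, the tedium can at least be scripted. Here is a hypothetical helper (the table/column pairs are illustrative; in practice they could be read from information_schema.columns) that emits the MySQL 5.6.4+ ALTER statements:

```python
# Sketch: generate ALTER statements to widen legacy DATETIME columns to
# DATETIME(6). The (table, column) list below is made up for illustration.
# Caveat: MODIFY restates the full column definition, so NOT NULL /
# DEFAULT clauses on real columns must be preserved by hand.
LEGACY_DATETIME_COLUMNS = [
    ("samiBackend_usuario", "last_login"),
    ("samiBackend_pedido", "creado"),
]

def alter_statements(columns, precision=6):
    return [
        "ALTER TABLE `%s` MODIFY `%s` DATETIME(%d);" % (table, col, precision)
        for table, col in columns
    ]

for stmt in alter_statements(LEGACY_DATETIME_COLUMNS):
    print(stmt)
```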



Re: [Question] MySQL Microseconds stripping

2015-12-18 Thread Cristiano Coelho
I'd also like to add that even if the MySQL column is a zero-fractional 
datetime column and you send a datetime with a fractional part (which is what 
Django does now), MySQL won't handle it correctly (i.e. trim the fraction, 
since the column has none); it just tries to match the fractional value. That 
makes me wonder whether this is actually a MySQL bug...




Re: [Question] MySQL Microseconds stripping

2015-12-18 Thread Cristiano Coelho
Erik,
I'm using MySQL 5.6.x, which indeed supports microseconds, but that's not the 
issue.

The issue is that every datetime column was created without microseconds (they 
were created under Django 1.7, so each is effectively a DATETIME(0) column) 
and I would like to keep it that way. However, since Django 1.8+ always sends 
microseconds in the query, inserts go fine (MySQL just ignores the fraction) 
but SELECTs all fail, because MySQL does not strip microseconds from the WHERE 
clause (even when the column is defined with zero precision; duh, really 
MySQL?). So basically everything that relies on datetime equality in a query 
stopped working.

As suggested, all datetime columns could be altered to DATETIME(6) so they 
work correctly with the new Django behaviour. But I would like to keep the 
columns as they are, since I don't need microseconds, so I'm wondering if 
there is any way to get the old Django behaviour back for MySQL, through a 
setting or a monkey patch (as long as it works for all models and raw queries).

On Saturday, 19 December 2015 at 01:59:00 (UTC-3), Erik Cederstrand wrote:
>
>
> > On 19 Dec 2015, at 07.52, Cristiano Coelho <cristia...@gmail.com> wrote:
> > 
> > Hello, 
> > 
> > After django 1.8, the mysql backend no longer strips microseconds. 
> > This is giving me some issues when upgrading from 1.7 (I actually 
> upgraded to 1.9 directly), since date times are not stored with micro 
> second precision on mysql, but the queries are sent with them. 
> > As I see it, my only option is to update all existing date time columns 
> of all existing tables, which is quite boring since there are many tables. 
> > Is there a way I can explicitly set the model datetime precision? Will 
> this work with raw queries also? Could this be a global setting or monkey 
> patch? This new behaviour basically breaks any '=' query on date times, at 
> least raw queries (I haven't tested the others) since it sends micro 
> seconds which are not stripped down. 
>
> MySQL as of version 5.6.4 (and MariaDB) is able to store microseconds in 
> datetime fields, but you need to set the precision when you create the 
> column. In Django, this should "just work". Which version of MySQL are you 
> using, and are your columns created as DATETIME(6)? 
>
> Erik



Re: [Question] MySQL Microseconds stripping

2015-12-19 Thread Cristiano Coelho
Aymeric is right. I do an insert with microseconds (since that's what 
django does right now) but mysql has the column defined as datetime(0), so 
it just strips the microsecond part, however, when doing the select, I'm 
expecting to get the value I have just inserted, but it doesn't work, since 
mysql doesn't strip microseconds from the select as it does for the insert. 
So this is really a mysql issue I guess...

About using a custom datetime field that strips microseconds, that won't 
work for raw queries I believe, not even .update statements as they ignore 
pre-save? As the stripping happens (or used to happen) at the sql query 
compile level.
This is really a bummer, because it seems like the only option is to 
convert all my datetime columns into datetime(6), which increases the table 
size and index by around 10%, for something I will never use.

Any other work around that can work with both normal and raw queries? 
Should I complain at mysql forums?
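One possible direction, sketched here without having been tested against the ORM: do the stripping in a custom field not only in pre_save() but also in get_prep_value(), which Django calls when preparing a datetime as a query parameter, so filters and updates would be covered too. The testable core is plain datetime truncation; the field wiring is only indicated in comments and is an assumption, not verified code.

```python
from datetime import datetime

def strip_microseconds(dt):
    """Truncate a datetime to whole seconds, as the pre-1.8 MySQL backend did."""
    return dt.replace(microsecond=0) if dt is not None else None

# Sketch of where this would hook into a custom Django field (untested):
# in a DateTimeField subclass, call strip_microseconds() in BOTH
# pre_save() (covers saves) and get_prep_value() (covers filters and
# .update(), since the ORM prepares query parameters through it).
# Raw cursor queries bypass fields entirely, so raw SQL parameters would
# still need stripping by hand.

dt = datetime(2015, 12, 19, 21, 52, 43, 123456)
print(strip_microseconds(dt))  # 2015-12-19 21:52:43
```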

On Saturday, 19 December 2015 at 07:39:12 (UTC-3), Shai Berger wrote:
>
> On Saturday 19 December 2015 11:23:17 Erik Cederstrand wrote: 
> > > On 19 Dec 2015, at 16.01, Aymeric Augustin wrote:
> > > 
> > > To be fair, this has a lot to do with MySQL’s lax approach to storing 
> > > data. There are so many situations where it just throws data away 
> > > happily that one can’t really expect to read back data written to 
> > > MySQL. 
> > > 
> > > That said, this change in Django sets up this trap for unsuspecting 
> > > users. Providing a way for users to declare which fields use the old 
> > > format won't work because of pluggable apps: they cannot know what 
> > > version of MySQL their users were running when they first created a 
> > > given datetime column. The best solution may be to provide a conversion 
> > > script to upgrade all datetime columns from the old to the new format. 
> > 
> > One simple solution could be for Cristiano to subclass DateTimeField 
> > to handle the microsecond precision explicitly. Something like this to 
> > strip: 
> > 
> > 
> > class DateTimeFieldWithPrecision(DateTimeField): 
> >     def __init__(self, *args, **kwargs): 
> >         # pop() so the custom kwarg isn't passed on to DateTimeField 
> >         self.precision = kwargs.pop('precision', 6) 
> >         assert 0 <= self.precision <= 6 
> >         super().__init__(*args, **kwargs) 
> > 
> >     def pre_save(self, model_instance, add): 
> >         dt = getattr(model_instance, self.attname) 
> >         # truncate microseconds to the configured precision; 
> >         # replace() returns a copy, so assign the result 
> >         step = 10 ** (6 - self.precision) 
> >         dt = dt.replace(microsecond=dt.microsecond // step * step) 
> >         setattr(model_instance, self.attname, dt) 
> >         return dt 
> > 
>
> If I get the complaints correctly, something similar would need to be done 
> when preparing a value for querying. 
>
> More generally, I think Cristiano just wants "the old field back" -- so, 
> he has a use-case for a DateTimeField which explicitly does not use second 
> fractions. We already have a DateTimeField which explicitly does not use 
> day fractions (DateField), so I suppose we could find sense in that... We 
> would typically suggest, as Erik implicitly did, that such a field be done 
> outside of Django, but the backward-compatibility issues mentioned by 
> Aymeric make it quite plausible that such a field will be added to core or 
> contrib. 
>
> Shai. 
>



Re: [Question] MySQL Microseconds stripping

2015-12-20 Thread Cristiano Coelho
Thanks for the suggestion. I think that workaround might just add too much 
code, so I'm probably going to convert every datetime column of every 
table to datetime(6) and absorb the extra storage (and probably a small 
performance impact?).
I also think the documented change needs a little more attention: it 
should mention that any equality query will stop working unless you either 
strip microseconds or upgrade your datetime columns to datetime(6) (and 
not even datetime(3) will work...)
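
The conversion route mentioned above can be sketched as follows; the table 
and column names are hypothetical (in practice the list would come from 
information_schema), and this only builds the MySQL statements that widen 
zero-precision DATETIME columns to DATETIME(6).

```python
# Hypothetical helper: build the ALTER statement for one column to widen.
# Table/column names used below are illustrative, not from a real schema.
def datetime6_alter(table, column):
    return "ALTER TABLE `%s` MODIFY `%s` DATETIME(6)" % (table, column)

stmts = [datetime6_alter(t, c)
         for t, c in [("app_event", "created"), ("app_event", "updated")]]
assert stmts[0] == "ALTER TABLE `app_event` MODIFY `created` DATETIME(6)"
```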


On Sunday, 20 December 2015 at 6:48:20 (UTC-3), Erik Cederstrand 
wrote:
>
>
> > On 20 Dec. 2015 at 01:04, Cristiano Coelho <cristia...@gmail.com> 
> > wrote: 
> > 
> > About using a custom datetime field that strips microseconds, that won't 
> work for raw queries I believe, not even .update statements as they ignore 
> pre-save? As the stripping happens (or used to happen) at the sql query 
> compile level. 
> > This is really a bummer, because it seems like the only option is to 
> convert all my datetime columns into datetime(6), which increases the table 
> size and index by around 10%, for something I will never use. Any other 
> work around that can work with both normal and raw queries? 
>
> While I understand that you'd like this to Just Work, you're sending 
> microseconds to the DB, knowing they will get lost, and expecting 
> comparisons to still work *with* microseconds. It's like expecting 12.34 == 
> int(12.34). 
>
> Why not strip the microseconds explicitly as soon as you're handed a 
> datetime with microseconds? That way you make it explicit that you really 
> don't want microseconds. That's less head-scratching for the next person to 
> work with your code. Just dt.replace(microsecond=0) all date values before 
> you issue a .filter(), .save(), .update(), .raw() or whatever. 
>
> > Should I complain at mysql forums? 
>
> You could try, but since Oracle took over, all my reports have been 
> answered with WONTFIX. Anyway, it'll be months or years before you get 
> something you can install on your server. 
>
> Erik



Re: [Question] MySQL Microseconds stripping

2015-12-21 Thread Cristiano Coelho
I think a simple setting that restores the old behaviour should be enough, 
shouldn't it? How would it interact with other database backends? I'm not 
sure whether Oracle has an option for datetime precision, but if it does, 
a global setting for datetime precision makes sense: right now you are 
pretty much forced to always use a precision of 6 (at least on MySQL?), 
and that might just be too much if you want a simpler datetime.

On Monday, 21 December 2015 at 19:54:29 (UTC-3), Josh Smeaton wrote:
>
> I think this is a fairly big oversight that should be fixed in the most 
> backwards-compatible way, so users don't need to change their code, or 
> only have to change it minimally. I'm with Aymeric here. Does Django have 
> visibility of the field constraints at insert/select queryset time? 
> Ideally Django would handle the differences transparently. If that's not 
> possible, then we should have a migration or script that does the one-off 
> conversion on behalf of users.
>
> ./manage.py mysql-upgrade-microseconds && ./manage.py migrate ?
>
>
> On Monday, 21 December 2015 19:39:44 UTC+11, Aymeric Augustin wrote:
>>
>> 2015-12-20 22:57 GMT+01:00 Cristiano Coelho <cristia...@gmail.com>:
>>
>>> Thanks for the suggestion, I think that work around might just add too 
>>> much code, so I'm probably going the way of converting every datetime 
>>> column of every table to datetime(6) and afford the extra storage (and 
>>> probably a little performance impact ?).
>>> I think the documented change might need a little more of attention, and 
>>> mention something about that any equality query will stop working if you 
>>> either don't strip microseconds or update datetime columns to datetime(6) 
>>> (and not even datetime(3) will work...)
>>>
>>
>> If that's the solution we end up recommending -- because the horse left 
>> the barn months ago... -- then we must document it in detail.
>>
>> This is a large backwards incompatibility that may result in subtle bugs 
>> and requires non-trivial steps to fix. It doesn't live up to Django's 
>> standards.
>>
>> -- 
>> Aymeric.
>>
>



Re: Deprecations vs. backwards-incompatible changes for tightening behavior that hides probable developer error

2015-12-21 Thread Cristiano Coelho
Hello, the select_related change was a really good one; after updating, I 
found around 3 or 4 queries with typos in select_related that had 
obviously never been noticed before. In this project finding those errors 
was not complicated at all, but I believe that on a big project that also 
has poor testing, the change might have given the team a few days or weeks 
of bug hunting.
Now the question is: even if it goes through a deprecation cycle, what if 
the team isn't even looking at deprecation warnings? The issues might 
still go undetected until it's too late and things start crashing 
everywhere, so I'm not really sure how big a difference it would have made.

On Monday, 21 December 2015 at 12:09:31 (UTC-3), Tim Graham wrote:
>
> I'd like to ask for opinions about whether or not deprecations are more 
> useful than making small backwards incompatible changes when they move 
> Django in a direction toward unhiding probable developer error.
>
> An example from a past release is the validation of fields in 
> select_related() [1]. This was done as a backwards-incompatible change in 
> 1.8: invalid fields immediately raised an error. Would you rather a change 
> like that go through the normal deprecation cycle (pending deprecation, 
> deprecation, error)? My reasoning against it is that it continues to hide 
> probable errors in new and old code (especially in the first release since 
> PendingDeprecationWarning isn't loud by default) and the cost for users of 
> fixing existing code is likely low compared to the overhead of a 
> deprecation cycle in Django. Did anyone find this was not the case in their 
> own projects?
>
> The question came up again with the behavior of the default error views 
> silencing TemplateDoesNotExist when a custom template name is specified [2].
>
> [1] 
> https://docs.djangoproject.com/en/dev/releases/1.8/#select-related-now-checks-given-fields
> [2] https://code.djangoproject.com/ticket/25697
>



Re: delegating our static file serving

2015-12-30 Thread Cristiano Coelho
Just curious about the PaaS point: I know AWS (Amazon) deploys 
Python/Django on Apache (which is quite decent), and Apache also serves 
static files decently. What would be wrong with that?
As for having Python serve static files (and potentially gzip them): 
Python is one of the slower languages out there, so I don't think that is 
really a good idea. Shouldn't static files be served by something 
implemented in C/C++ or a similarly high-performance language (read: nginx 
or Apache)? How wrong am I?
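
As background, the wsgi.file_wrapper mechanism discussed in the thread can 
be sketched in a few lines of plain WSGI. This is a minimal illustration, 
not Django's implementation; the in-memory BytesIO here stands in for an 
opened file on disk.

```python
import io

def static_app(environ, start_response):
    # Stand-in for an opened static file on disk.
    body = io.BytesIO(b"console.log('hi');")
    start_response('200 OK', [('Content-Type', 'application/javascript')])
    wrapper = environ.get('wsgi.file_wrapper')
    if wrapper is not None:
        # Server-provided wrapper: may use an efficient primitive such as
        # sendfile() under the hood, bypassing Python-level chunking.
        return wrapper(body, 8192)
    # Portable fallback: stream the file in 8 KB chunks (the pattern that
    # StreamingHttpResponse made practical in Django).
    return iter(lambda: body.read(8192), b'')
```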

On Friday, 5 December 2014 at 5:19:45 (UTC-3), Aymeric Augustin 
wrote:
>
> Yes, I support this plan.
>
> "Serve your files with nginx!" doesn't fly in the age of PaaS.
>
> Serving static files with Django and having a CDN cache them is a 
> reasonable setup as far as I know.
>
> I don't know if the "probably insecure" argument still holds. Are there 
> specific security risks in serving files that don't exist in serving 
> dynamic content? I'd say dynamic content is more difficult. The main 
> problem I can imagine when serving files is directory traversal. We already 
> have protections against this class of attacks in several features. (Usual 
> disclaimer: my background in security is mostly theoretical.)
>
> -- 
> Aymeric.
>
>
> 2014-12-05 6:33 GMT+01:00 Collin Anderson:
>
>> Hi All,
>>
>> I'm pretty interested in getting secure and _somewhat_ efficient static 
>> file serving in Django.
>>
>> Quick history:
>> 2005 - Jacob commits #428: a "static pages" view. "Note that this view 
>> should only be used for testing!"
>> 2010 - Jannis adds staticfiles. Serving via django is considered "grossly 
>> inefficient and probably insecure".
>> 2011 - Graham Dumpleton adds wsgi.file_wrapper to Gunicorn.
>> 2012 - Aymeric adds StreamingHttpResponse and now files are read in 
>> chunks rather than reading the entire file into memory. (No longer grossly 
>> inefficient IMHO.)
>>
>> I propose:
>> - Deprecate the "show_indexes" parameter of static.serve() (unless people 
>> actually use it).
>> - Have people report security issues to secu...@djangoproject.com (like 
>> always)
>> - Audit the code and possibly add more security checks and tests.
>> - add wsgi.file_wrapper support to responses (5-line proof of concept: 
>> https://github.com/django/django/pull/3650 )
>> - support serving static files in production, but still recommend 
>> nginx/apache or a cdn for performance.
>> - make serving static files in production an opt-in, but put the view in 
>> project_template/project_name/urls.py
>>
>> I think it's a huge win for low-traffic sites or sites in the "just 
>> trying to deploy and get something live" phase. You can always optimize 
>> later by serving via nginx or cdn.
>> We already have the views, api, and logic around for finding and serving 
>> the correct files.
>> We can be just as efficient and secure as static/dj-static without 
>> needing to make people install and configure wsgi middleware to the 
>> application.
>> We could have staticfiles classes implement more complicated features 
>> like giving cache recommendations, and serving pre-gzipped files.
>>
>> Is this a good idea? I realize it's not totally thought through. I'm fine 
>> with waiting until 1.9 if needed.
>>
>> Collin
>>
>> On Saturday, November 29, 2014 6:07:05 PM UTC-5, Collin Anderson wrote:
>>>
>>> Hi All,
>>>
>>> I think doing something here is really good idea. I'm happy with any of 
>>> the solutions mentioned so far.
>>>
>>> My question is: what does static/dj-static do that our built-in code 
>>> doesn't do? What makes it more secure? It seems to me the only things 
>>> we're missing are wsgi.file_wrapper and maybe a few more security 
>>> checks. Why don't we just make our own code secure and start supporting 
>>> it? Here's basic wsgi.file_wrapper support: 
>>> https://github.com/django/django/pull/3650
>>>
>>> We could then, over time, start supporting more extensions ourselves: 
>>> ranges, pre-gziped files, urls with never-changing content, etc. That way 
>>> we get very, very deep django integration. It seems to me this is a piece 
>>> that a web framework should be able to support itself.
>>>
>>> Collin
>>>
>>>
>>> On Friday, November 28, 2014 9:15:03 AM UTC-5, Tim Graham wrote:

 Berker has worked on integrating gunicorn with runserver so that we might 
 be able to deprecate our own homegrown webserver. Windows support for 
 gunicorn is supposedly coming soon, which may actually make the idea 
 feasible. This way we provide a more secure solution out of the box 
 (anecdotes indicate that runserver is widely used in production despite 
 our documentation warnings against doing so).

 On the pull request, Anssi had an idea to use dj-static to serve 
 static/media files. My understanding is that we

Re: PostGres 9.5 Upsert

2016-01-09 Thread Cristiano Coelho
I agree! Also, does this already happen for the MySQL backend? MySQL has 
INSERT ... ON DUPLICATE KEY UPDATE, which could work the same way.
However, if I'm not wrong, the docs state that the above methods have a 
race condition (obvious, since right now they perform two operations), but 
if the code used native database operations, the race condition would be 
gone for those cases, so that should probably be documented as well.
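
To illustrate the single-statement semantics under discussion, here is a 
sketch using SQLite's ON CONFLICT clause (the same shape as PostgreSQL 
9.5's; MySQL spells it INSERT ... ON DUPLICATE KEY UPDATE). The table name 
is made up, and the upsert syntax requires SQLite 3.24+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (name TEXT PRIMARY KEY, value INTEGER)")

def upsert(name, value):
    # One statement, so there is no get-then-create race window.
    conn.execute(
        "INSERT INTO counter (name, value) VALUES (?, ?) "
        "ON CONFLICT(name) DO UPDATE SET value = excluded.value",
        (name, value),
    )

upsert("hits", 1)   # inserts the row
upsert("hits", 5)   # updates it in place
row = conn.execute("SELECT value FROM counter WHERE name = 'hits'").fetchone()
assert row == (5,)
```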

On Friday, 8 January 2016 at 21:13:26 (UTC-3), bliy...@rentlytics.com 
wrote:
>
> Hey Guys,
>
> Postgres 9.5 has added the functionality for UPSERT aka update or insert.  
> Any interest in aligning UPSERT on the db layer with the get_or_create or 
> update_or_create functionality in django?  Sounds like my company would be 
> interested in doing the work if the PR will get the traction.
>
> -Ben
>



Re: Improving MSSQL and Azure SQL support on Django

2016-01-26 Thread Cristiano Coelho
I'm interested in the progress of this as well :)

Sorry, I didn't read through all the posts, mostly the first ones about 
the idea.

I would like to know: have you decided on which adapter to use? I had a 
project where we needed to connect to SQL Server from a Linux machine 
(actually AWS Lambda) and, even worse, we couldn't install any library 
with dependencies on C code, so we used one implemented in pure Python 
that worked very well (pytds, if I'm not wrong), though of course not with 
the best performance.
Why do I mention this? Because even if you want Django to run against SQL 
Server, it doesn't really mean you want to run it on a Windows machine; 
actually, that would probably be a terrible idea (no offense), since 
Apache works horribly on Windows and Linux is actually the best OS to run 
a Python web server (either nginx or Apache). So please keep this in mind 
when choosing a connector, since it will probably have C dependencies (the 
pure Python ones are quite slow).

About whether you need different connectors for Azure and SQL Server: I'm 
'almost' sure you don't; we use Azure and other cloud-based SQL Server 
deployments with no problems using standard SQL Server connectors.

So basically, do not aim this at making Django more Windows-friendly, but 
rather at the actual SQL Server backend.

On Monday, 25 January 2016 at 22:59:07 (UTC-3), Fabio Caritas Barrionuevo 
da Luz wrote:
>
> is there any update about the progress of this?
>
> -- 
> Fábio C. Barrionuevo da Luz
> Palmas - Tocantins - Brasil - América do Sul
>
>
>
>> On Tuesday, 13 October 2015 at 18:12:55 UTC-3, Tim Graham wrote:
>>
>> If anyone is interested in listening in on the meetings with Microsoft 
>> engineers (Wednesday and Thursday 9am-5pm Pacific), let me know and I'll 
>> send you the Skype link.
>>
>> On Friday, October 2, 2015 at 11:53:17 AM UTC-7, Meet Bhagdev wrote:
>>>
>>>
>>>
>>> On Thursday, October 1, 2015 at 12:32:25 PM UTC-7, Tim Graham wrote:

 Hi Meet,

 I was wondering

 1. If you have any progress updates since your last message?

>>> 
>>>
>>> *Yes, engineers on my team and I are currently ramping up on the three 
>>> Django-SQL Server adapters:*
>>>
>>>- *django-pymssql*
>>>- *django-pyodbc-azure*
>>>- *django-mssql*
>>>
>>> *The goal is to have a thorough understanding of what’s good and 
>>> what’s bad with these adapters before the event.*
>>>

 2. If you have any further details on the schedule for the time in 
 Seattle in a week and a half? (including video conference details for 
 those 
 unable to attend in person)

>>>
>>>- *We will have a video conference link for Day 2 and Day 3. 
>>>Participants interested can join the conference stream from their 
>>> browser. 
>>>The conference room mics are only capable to a certain extent. Thus the 
>>>quality might be a little poor. *
>>>
>>>
>>>- *We are finalizing the detailed schedule this week and will post 
>>>it on this thread by next Friday.  *
>>>
>>>
>>> 3. If myself or the other attendees should do anything to prepare for 
 the meetings?

 *Here are some things that you should prepare before coming to 
>>> Seattle:*
>>>
>>>- *Have a clear understanding of the things that you need from 
>>>Microsoft to improve the SQL Server support in Django. We have 
>>>resources to do the heavy lifting but need guidance.*
>>>- *Share with us the issues we can help fix (on the Django side 
>>>and on the Django ORM (database) side).*
>>>
>>>
>>> Thanks!

 On Thursday, September 17, 2015 at 3:38:09 PM UTC-4, Tim Allen wrote:
>
> Hey team, as promised, here are the simple tests I put together to 
> benchmark pyodbc vs pymssql. Be kind, this was Python I wrote a long time 
> ago!
>
> https://github.com/FlipperPA/pyodbc-pymssql-tests
>
> I've included example output on the README. Very basic, but useful.
>
> On Wednesday, September 16, 2015 at 11:27:59 AM UTC-4, Tim Allen wrote:
>>
>> Thanks for all of your efforts, Aymeric, I've been following your 
>> project since its inception - I'm FlipperPA on GitHub.
>>
>> On Sunday, September 13, 2015 at 4:59:34 AM UTC-4, Aymeric Augustin 
>> wrote:
>>>
>>> Did you mean “pyodbc outperforms pymssql”? Or did you go with pyodbc 
>>> despite lower performance? (Or did I misread that?)
>>>
>>
>> We went with pyodbc, despite lower performance. I've been meaning to 
>> put the simple tests up on GitHub - making a note to do that this week.
>>
>> At the time we were looking at options, we couldn't find a stable 
>> Django option for pymssql. I should have been more clear about the time 
>> frame in which we were testing as well; this was right around the time 

Re: Improving MSSQL and Azure SQL support on Django

2016-01-28 Thread Cristiano Coelho
Tim Allen,

What you said about compiling the C dependencies on a similar machine and 
then uploading everything together does indeed work (it was one of the 
options), but it caused some other issues (e.g. we usually develop on 
Windows, and the compiled libraries are very platform-specific), and 
performance was really not that important in this case. Just letting you 
know that your idea works most of the time if you are willing to do the 
extra work.

On Thursday, 28 January 2016 at 12:48:29 (UTC-3), Tim Allen wrote:
>
> Thanks to everyone for their efforts; my workplace has a mix of SQL Server 
> and PostgreSQL, heavier on the SQL Server side. Due to some groups' reliance 
> on SSIS and tight SQL Server integration with data vendors, that isn't 
> going to change any time soon, so this project is one we're following 
> closely as well. We've tried to contribute by way of feedback, testing 
> various configurations with various drivers, some documentation and a 
> minuscule amount of code contribution.
>
> In case this anecdotal evidence helps anyone in the meantime, the stack 
> we've found most reliable these days (from RedHat / CentOS, at least, but 
> also partially tested on Ubuntu) is:
>
> - FreeTDS 0.95 (supports TDS version 7.3) with unixODBC. We tried the 
> Microsoft provided ODBC driver, but ran into quite a few issues, 
> particularly with multi-threading.
> - pyodbc 3.0.10. pyodbc just works. We get slightly better performance 
> with pymssql, but have found pyodbc to be more frequently updated and 
> rock-solid. The performance upgrade didn't warrant using pymssql in our 
> case, but is worth mentioning.
> - django-pyodbc-azure. This is kept up to date with Django and Python 
> release versions, and works with the least amount of configuration tweaking 
> that we have found.
>
> We're on a mix of RHEL/CentOS 6 and 7, and have gotten this stack running 
> reliably up to v7.2. YMMV, of course!
>
> As for the C dependencies, have you considered building the necessary C 
> binaries on another server, and then just including them as part of a 
> wheel (or something like that) for installation where you couldn't 
> install them otherwise? This sounds like a perfect use case for a 
> temporary vagrant box you could blow away after compiling. Just a thought 
> that might give you the performance you need without stepping on anyone's 
> toes.
>
> On Wednesday, January 27, 2016 at 12:15:48 AM UTC-5, Cristiano Coelho 
> wrote:
>> [quoted text trimmed; Cristiano's message of 2016-01-26 appears in full 
>> above]

Admin Improvement - Autocomplete & improved search for FKs

2016-02-02 Thread Cristiano Coelho

Hello,

On one of my projects I'm using django-grappelli to improve the admin site 
( https://github.com/sehmaschine/django-grappelli ), which I adopted before 
the very nice face wash the admin interface received in 1.9. So right now, 
if I were to update to 1.9, the only things I would use from this library 
would be the autocomplete and improved search features for foreign key 
models (including inlines) and some other very minor features, since the 
1.9 interface is quite nice.

What's the autocomplete like? Basically, rather than a standard dropdown 
for foreign keys, you get a text field you can search in, which then 
populates the choices. The library implements it by adding an additional 
required static method to models indicating which fields should be 
searched (though in my opinion that should actually go into the ModelAdmin 
definition), plus jQuery UI autocomplete, some other JavaScript, and the 
required URLs. I think anyone around knows how an autocomplete works.

Then there's the improved search, which is a second option (you may use 
one, both or none): the library adds a small search button next to the 
dropdown (or text field) which opens a pop-up (I actually don't like it 
being a pop-up at all) containing the actual changelist page for that 
model (the one behind the FK dropdown/text field), with all the advantages 
of the features you implemented for that model (i.e. searching, filtering, 
sorting, etc.).
I have attached some screenshots of these two to help understand them.

There are other minor features that are nice too, like the ability to 
collapse inline models, or using dropdowns for the filters on the right 
rather than displaying all values (which gets really bad for big tables).

So these two features are quite nice, but installing a complete external 
library built for much more (it pretty much changes everything on the 
admin page) seems like a bad idea. 
*Would it be worth it to have these two features implemented in Django?*

There are also other projects that just add the autocomplete feature, 
which I haven't used or tested ( 
https://django-autocomplete-light.readthedocs.org/en/master/ and 
https://github.com/crucialfelix/django-ajax-selects ), so the feature 
seems to be in demand.
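
The model-level hook described above looks roughly like this (modeled on 
django-grappelli's documented convention; the Model base class is stubbed 
so the sketch runs without Django installed, and Country is a made-up 
example model):

```python
class Model:
    """Stand-in for django.db.models.Model, so the example is self-contained."""

class Country(Model):
    @staticmethod
    def autocomplete_search_fields():
        # Field lookups the admin autocomplete searches through.
        return ("id__iexact", "name__icontains")

assert Country.autocomplete_search_fields() == ("id__iexact", "name__icontains")
```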








Re: Admin Improvement - Autocomplete & improved search for FKs

2016-02-03 Thread Cristiano Coelho
Nice, I guess my Google skills are quite bad, since I looked for it and 
didn't find it :)

It was opened 5 years ago and has seen progress, but it still isn't 
complete; it looks like it may need some help.

On Wednesday, 3 February 2016 at 9:13:54 (UTC-3), Tim Graham wrote:
>
> Here is an accepted ticket for autocomplete: 
> https://code.djangoproject.com/ticket/14370
>
> The "improved search" seems like what we already have when using 
> raw_id_fields, so it seems like it should be easy to include as part of 
> the autocomplete UI. I don't think it needs to be a separate option.
>
> On Tuesday, February 2, 2016 at 10:20:07 PM UTC-5, Cristiano Coelho wrote:
>>
>>
>> <https://lh3.googleusercontent.com/-RY84l0o3Ac0/VrFszlul9uI/AGg/0SxzpJ86MLs/s1600/improved_search.PNG>
>>
>>
>> <https://lh3.googleusercontent.com/-5tiY0Xk2bkk/VrFsJtgi8WI/AGc/lQmYfhzOUAo/s1600/autocomplete.png>
>> Hello,
>>
>> On one of my projects I'm using django-grappelli to improve the admin 
>> site ( https://github.com/sehmaschine/django-grappelli ) which was used 
>> before the very nice face wash the admin interface received on 1.9. So 
>> right now if I were to update to 1.9, the only thing I would use from this 
>> library would be the autocomplete and improved search features for foreign 
>> key models (includes inlines as well) and some other very minor features 
>> since the 1.9 interface is quite nicer.
>>
>> What's Autocomplete like? Basically, rather than a standard dropdown for 
>> foreign keys, you have a textfield where you can search and it then 
>> populate, the library implements it adding an additional required static 
>> method to models so you indicate which fields should be searched through 
>> (however in my opinion that should actually go into the ModelAdmin 
>> definition) and then jqueryui autocomplete and some other javascript plus 
>> required urls for this. I think anyone around knows how an auto complete 
>> works.
>>
>> Then there's the improved search, which is a second option (you may use 
>> one, both, or neither). The library adds a small search button next to 
>> the dropdown (or text box) which opens a pop-up (I actually don't like it 
>> being a pop-up at all) that simply contains the actual list page for that 
>> model (the one behind the FK dropdown/text box), with all the advantages 
>> of the features you implemented for that model (i.e. searching, 
>> filtering, sorting, etc.).
>> I have attached some screenshots of these two to help understand it.
>>
>> There are other minor features that are also nice, like the ability to 
>> collapse inline models, or to use dropdowns for the filters on the right 
>> rather than displaying all values (which gets really bad for big 
>> tables).
>>
>> So these two features are quite nice, but installing a complete external 
>> library that does far more (it pretty much changes everything on the 
>> admin page) seems like a bad idea. 
>> *Would it be worth implementing these two features in Django itself?*
>>
>> There are also other projects that add just the autocomplete feature, 
>> which I haven't used or tested ( 
>> https://django-autocomplete-light.readthedocs.org/en/master/ and 
>> https://github.com/crucialfelix/django-ajax-selects ), so the feature 
>> seems to be in demand.
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/860e4e41-b893-4b8d-b0d3-17bc98dc1c67%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Announcing Django-Zappa - Serverless Django on AWS Lambda + API Gateway

2016-02-08 Thread Cristiano Coelho
Hello, I would like to suggest that you document the limitations AWS Lambda 
and API Gateway have, since I have used them and they are not suitable for 
every use case.

For example, one of the biggest limitations is that a Lambda can run for 
at most 5 minutes, but when paired with API Gateway it can only run for 1 
minute before timing out. I know 1 minute is quite a lot, but if you have 
any kind of multimedia processing (like uploading a file, or downloading 
one that is generated on the server, such as a PDF) you will need to 
rewrite some code to use other Amazon services such as S3.
Other limitations include 100 concurrent requests at a time, 500 requests 
per second, request and response payload size limits of 6 MB, and other 
things that can be read here: 
http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html

Some (most) of those limits can be increased by request to Amazon.

Leaving the limitations aside, AWS Lambda is really a great product I 
highly recommend!

El lunes, 8 de febrero de 2016, 11:23:47 (UTC-3), Rich Jones escribió:
>
> Hey guys!
>
> (I also made this post to django-users, but I think the discussion here 
> can be about ways that we can improve Django to work better on AWS Lambda. 
> Forgive the double-post.)
>
> I'm pleased to announce the release of Django-Zappa - a way to run 
> serverless Django on AWS Lambda + API Gateway.
>
> Now, with a single command, you can deploy your Django apps in an 
> infinitely scalable, zero-configuration and incredibly cheap way!
>
> Read the announcement post here: 
> https://gun.io/blog/announcing-zappa-serverless-python-aws-lambda/
> Watch a screencast here: 
> https://www.youtube.com/watch?v=plUrbPN0xc8
> And see the code here: https://github.com/Miserlou/django-zappa
>
> Comments, questions and pull requests are welcome!
>
> It seems quite performant already, but I bet there are ways that we can 
> improve Django to work better on Lambda. 
>
> Enjoy,
> Rich Jones
>



Composite Primary Keys

2016-02-16 Thread Cristiano Coelho
Hello there,

What's the status of this? The wiki page 
(https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys) was last 
edited 3 years ago and the links on it are even older. Googling around only 
turned up some very old projects, so that wasn't much help either.



Django admin and messages

2016-02-20 Thread Cristiano Coelho
Hello, 

It seems that all admin "success" (and similar) messages are hardcoded 
(well, almost -- they do go through translation) inside the methods that 
use them and cannot be easily changed (for example 'The %(name)s "%(obj)s" 
was added successfully. You may add another %(name)s below.').
This causes problems in languages with heavier grammatical gender usage, 
like Spanish, where you can end up with texts like 'El Persona "Id: 123" 
...' (it should be "La Persona"). It would be convenient to be able to 
easily change these messages, so you could replace them with something like 
"Object added successfully" to avoid the issue above.

Looking through the file django/contrib/admin/options.py you can see the 
heavy use of message strings defined deep inside the methods, which makes 
it impossible to override any of them without copying the whole method.
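[Editor's note] To make the gender problem concrete, here is a minimal sketch. The Spanish template below is invented for illustration and is not Django's actual translation; the point is that the translated template commits to one article, so every interpolated model name inherits the same gender.

```python
# The admin builds its messages by %-interpolating a translated template.
# A translation must pick one article ("El"), so feminine model names
# come out wrong. (Illustrative Spanish string, not Django's catalog.)
template_es = 'El %(name)s "%(obj)s" fue agregado exitosamente.'

msg = template_es % {"name": "Persona", "obj": "Id: 123"}
print(msg)  # El Persona "Id: 123" fue agregado exitosamente.
            # -- grammatically it should be "La Persona"
```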



Re: Improving django.setup() and app loading performance

2016-02-26 Thread Cristiano Coelho
If by "serverless" you mean deployments such as AWS Lambda or similar, I 
don't think setup is called on every request; at least for AWS Lambda, the 
environment is cached, so it should only happen once each time it needs to 
scale up. Are there any other issues?

El viernes, 26 de febrero de 2016, 12:36:57 (UTC-3), Rich Jones escribió:
>
> (Originally posted this as a ticket, but we can have discussion here and 
> leave the ticket  for just 
> more specific discussion.)
>
> I imagine that this is an area that hasn't really been given much 
> consideration with regards to optimization, because it isn't relevant to 
> normal Django deployments. However, with "serverless" deployments (those 
> without any permanent infrastructure), this becomes quite relevant as we 
> have to call setup() every request. 
>
>
> So, I'd love to discuss ideas for performance optimizations to Django's 
> setup method. (A sample output of the profile is available here: 
> https://github.com/Miserlou/django-zappa/issues/24 )
>
>
> For starters - can we load apps in parallel? 
>



Re: Improving django.setup() and app loading performance

2016-02-26 Thread Cristiano Coelho
Rich,

I believe you know way more than me about AWS Lambda since you have made 
such a great project, and I'm really interested in how it actually works, 
since their documentation is a bit superficial.

On their FAQ this is what they state:

*Q: Will AWS Lambda reuse function instances?*

To improve performance, AWS Lambda may choose to retain an instance of your 
function and reuse it to serve a subsequent request, rather than creating a 
new copy. Your code should not assume that this will always happen.


I always thought, and have seen with very simple testing, that your code is 
basically "frozen" between requests: for example, every import done in the 
main module happens only once (so your whole program is only initialized 
once). This means Django would only initialize once for quite a while 
(until your instance of the code is discarded, at which point a new request 
causes all modules to be imported again). So technically, if you arrange 
the imports so that Django calls setup() at import time, it should only run 
once. What is really happening that makes it call setup() on every request?


With the above said, if that's really the case: Python is able to serialize 
objects in a very interesting way (pickling), where you can send an 
instance with its state over the wire, and on the other side call every 
method defined on its class, with the instance keeping its state. Would it 
be possible to store the state this way somehow?
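[Editor's note] A minimal illustration of the pickling idea, with one caveat worth stating: pickle stores an instance's state plus a *reference* to its class, so the class definition must be importable on the receiving side; it does not ship the method code itself.

```python
import pickle

class Counter:
    """Toy object whose state we want to survive a round trip."""
    def __init__(self):
        self.n = 0

    def bump(self):
        self.n += 1
        return self.n

c = Counter()
c.bump()
c.bump()

blob = pickle.dumps(c)         # serialize instance state to bytes
restored = pickle.loads(blob)  # "the other side" rebuilds the instance
print(restored.bump())  # 3 -- state (n == 2) survived, methods still work
```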





El viernes, 26 de febrero de 2016, 17:53:00 (UTC-3), Rich Jones escribió:
>
> @Aymeric
> > In my opinion, the first concrete step would be to measure how much time 
> is spent executing Django code rather than importing Python modules.
>
> You can find a complete profile of a Django request as it goes through the 
> complete request/response loop here: 
> https://github.com/Miserlou/django-zappa/files/141600/profile.txt
>
> Over 70% of the total request time is spent in django.setup() - so you can 
> see why we have an incentive to improve this! 
>
>
> @ Cristiano - 
> > If with "serverless" you are talking deployments such as Amazon Lambda 
> or similar, I don't think setup is called on every request, at least for 
> AWS Lambda, the enviorment is cached so it will only happen once each time 
> it needs to scale up. Are there any other issues?
>
> You're halfway there, but the process is more complicated than that. The 
> code is cached, not the internal state of the machine. You can follow our 
> progress here: https://github.com/Miserlou/django-zappa/
>
> But - another type of caching could be another possibility. Can anybody think 
> of any arrangements where we could perhaps call setup() with 
> "pre-loaded"/cached applications? 
>



Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
Sorry if this looks like a silly question, but have you tried moving the 
setup calls to the top of the lambda_handler module? I'm not sure why you 
need the settings data as an argument to the lambda handler, but if you 
find a way to move those 4 lines next to setup(), you will only load the 
whole Django machinery once (or until AWS decides to kill your instance). 
My wild guess is that this is related to the way you have implemented the 
"publish to AWS" process. But the main issue here is that you are calling 
setup() on every request, when you really shouldn't be; it should happen 
at module level.

I'm sorry if this goes away too far from the actual thread, since this 
looks more like a response to a zappa forum :)



El lunes, 29 de febrero de 2016, 19:34:16 (UTC-3), Rich Jones escribió:
>
> For those who are still following along, these are the lines in question:
>
>
> https://github.com/Miserlou/django-zappa/blob/master/django_zappa/handler.py#L31
>
> https://github.com/Miserlou/django-zappa/blob/master/django_zappa/handler.py#L68
>
> It's very possible there are ways of significantly improving this. 
> Suggestions always welcome!
>
> R
>
> On Monday, February 29, 2016 at 11:26:17 PM UTC+1, Rich Jones wrote:
>>
>> Hey all!
>>
>> Let me clarify a few of the terms here and describe a bit how Django 
>> operates in this context.
>>
>> "Serverless" in this context means "without any permanent 
>> infrastructure". The server is created _after_ the request comes in, and it 
>> dies once the response is returned. This means that we never have to worry 
>> about server operations or horizontal scalability, and we pay far, far far 
>> less of a cost, as we only pay for server time by the millisecond. It's 
>> also radically easier to deploy - a single 'python manage.py deploy 
>> production' gives you an infinitely scalable, zero-maintenance  app. 
>> Basically, Zappa is what comes after Heroku.
>>
>> To do this, we use two services - Amazon Lambda and Amazon API Gateway.
>>
>> The first, AWS Lambda - allows us to define any arbitrary function, which 
>> Amazon will then cache to memory and execute in response to any AWS system 
>> event (S3 uploads, Emails, SQS events, etc.) This was designed for small 
>> functions, but I've been able to squeeze all of Django into it.
>>
>> The other piece, API Gateway, allows us to turn HTTP requests into AWS 
>> events - in this case our Lambda context. This requires using a nasty 
>> language called 'VTL', but you don't need to worry about this.
>>
>> Zappa converts this API Gateway request into a 'normal' Python WSGI 
>> request, feeds it to Django, gets the response back, and performs some 
>> magic on it that lets it get back out through API Gateway.
>>
>> You can see my slides about this here: 
>> http://jsbin.com/movecayuba/1/edit?output
>> and a screencast here: https://www.youtube.com/watch?v=plUrbPN0xc8
>>
>> Now, this comes with a cost, but that's the trade off. The flip side is 
>> that it also means we need to call Django's 'setup()' method every time. 
>> All of this currently takes about ~150ms - the majority of which is spent 
>> setting up the apps. If we could do that in parallel, this would greatly 
>> increase the performance of every django-zappa request. Make sense?
>>
>>
>>
>> *We also have a Slack channel  
>> where we are working on this if you want to come by! *R
>>
>>



Re: Making max_length argument optional

2016-02-29 Thread Cristiano Coelho
I find that using TextField rather than CharField just to make Postgres use 
varchar() is a terrible idea. If you are implementing a reusable app and it 
is used on a backend like MySQL, where TextFields are created as text 
columns, which are horribly inefficient and should be avoided at all costs, 
you will have a really bad time.
I'm not sure about Postgres, but I want to believe that using varchar 
without a limit also has performance considerations that should be taken 
into account.

El lunes, 29 de febrero de 2016, 17:58:33 (UTC-3), Shai Berger escribió:
>
> Hi, 
>
> Thank you, Aymeric, for summing up the discussion this way. The division 
> into 
> two separate problems is indeed required, and I fully support the idea of 
> setting max_length's default to 100 or 120. 
>
> There seem to be just two points worth adding to your summary: 
>
> On Monday 29 February 2016 11:19:02 Aymeric Augustin wrote: 
> > 
> > 2) How can we make it easy for PostgreSQL users to just use VARCHAR()? 
> > 
> > Since this is a PostgreSQL-specific feature, having a variant of 
> CharField 
> > in django.contrib.postgres that supports and perhaps even defaults to 
> > unlimited length shouldn’t be controversial. 
> > 
>
> The first -- I believe it was raised very early on by Christophe Pettus -- 
> is 
> that Django already has a field that manifests on PG as VARCHAR(), and 
> that is 
> TextField. However, I don't like the idea that PG users should be using 
> TextField(widget=TextInput) as a replacement for CharField; I find that 
> counter-intuitive -- even if just because it is a "bad name". Names are 
> important. 
>
> The second -- in response to a comment made by Josh Smeaton -- is that 
> having 
> django.db.models.CharField with default max_lenth=N (for some finite N) 
> and 
> django.contrib.postgres.CharField with default max_length=None (meaning 
> infinity) sounds like a bad idea. 
>
> > 
> > I hope this helps! 
>
> I'm certain it did! 
>
> Shai. 
>



Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
I'm almost sure that right now you are calling setup() with Django already 
initialized in the cases where the environment is reused; I'm amazed Django 
doesn't complain when setup() is called twice.

El lunes, 29 de febrero de 2016, 21:07:32 (UTC-3), Rich Jones escribió:
>
> Haven't tried that! I don't _think_ that'll work.. but worth a shot (I 
> don't think they only cache the handler.. I think they cache the whole 
> environment). Will report back! And as you're mentioning, we really 
> shouldn't be doing it every request, so if there were even a way to cache 
> that manually rather than calculate it every time, that'd be just as 
> valuable.
>
> There is no "zappa forum" - it's just me and a few other contributors in a 
> Slack channel! And all of this stuff is super new, and I'm sure that you 
> guys know a lot more about the Django internals than I do, so all 
> suggestions are welcome!
>
> R
>
> On Tuesday, March 1, 2016 at 12:57:28 AM UTC+1, Cristiano Coelho wrote:
>>
>> Sorry if this looks like a retarded question, but have you tried the 
>> setup calls to the top of the lambda_handler module? I'm not sure why you 
>> need the settings data as an argument to the lambda handler, but if you 
>> find a way to move those 4 lines near setup(), you will only load the whole 
>> django machinery once (or until AWS decides to kill your instance). I have 
>> a wild guess that this is related to the way you have implemented the 
>> "publish to aws" process. But the main issue here is that you are calling 
>> setup() on every request, where you really shouldn't be doing that, but 
>> rather do it at module level.
>>
>> I'm sorry if this goes away too far from the actual thread, since this 
>> looks more like a response to a zappa forum :)
>>
>>
>>
>> El lunes, 29 de febrero de 2016, 19:34:16 (UTC-3), Rich Jones escribió:
>>>
>>> For those who are still following along, these are the lines in question:
>>>
>>>
>>> https://github.com/Miserlou/django-zappa/blob/master/django_zappa/handler.py#L31
>>>
>>> https://github.com/Miserlou/django-zappa/blob/master/django_zappa/handler.py#L68
>>>
>>> It's very possible there are ways of significantly improving this. 
>>> Suggestions always welcome!
>>>
>>> R
>>>
>>> On Monday, February 29, 2016 at 11:26:17 PM UTC+1, Rich Jones wrote:
>>>>
>>>> Hey all!
>>>>
>>>> Let me clarify a few of the terms here and describe a bit how Django 
>>>> operates in this context.
>>>>
>>>> "Serverless" in this contexts means "without any permanent 
>>>> infrastructure". The server is created _after_ the request comes in, and 
>>>> it 
>>>> dies once the response is returned. The means that we never have to worry 
>>>> about server operations or horizontal scalability, and we pay far, far far 
>>>> less of a cost, as we only pay for server time by the millisecond. It's 
>>>> also radically easier to deploy - a single 'python manage.py deploy 
>>>> production' gives you an infinitely scalable, zero-maintenance  app. 
>>>> Basically, Zappa is what comes after Heroku.
>>>>
>>>> To do this, we use two services - Amazon Lambda and Amazon API Gateway.
>>>>
>>>> The first, AWS Lamdba - allows us to define any arbitrary function, 
>>>> which Amazon will then cache to memory and execute in response to any AWS 
>>>> system event (S3 uploads, Emails, SQS events, etc.) This was designed for 
>>>> small functions, but I've been able to squeeze all of Django into it.
>>>>
>>>> The other piece, API Gateway, allows us to turn HTTP requests into AWS 
>>>> events - in this case our Lambda context. This requires using a nasty 
>>>> language called 'VTL', but you don't need to worry about this.
>>>>
>>>> Zappa converts this API Gateway request into a 'normal' Python WSGI 
>>>> request, feeds it to Django, gets the response back, and performs some 
>>>> magic on it that lets it get back out through API Gateway.
>>>>
>>>> You can see my slides about this here: 
>>>> http://jsbin.com/movecayuba/1/edit?output
>>>> and a screencast here: https://www.youtube.com/watch?v=plUrbPN0xc8
>>>>
>>>> Now, this comes with a cost, but that's the trade off. The flip side is 
>>>> that it also means we need to call Django's 'setup()' method every time. 
>>>> All of thi

Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
That's quite odd. I recall testing this once: I created a lambda that had a 
datetime.now() at the top of the module and just returned that value. Out 
of a few calls, it returned only two different results, meaning the module 
was reused most of the time. This was tested by invoking the lambda from 
the AWS console test itself and not through API Gateway, so perhaps API 
Gateway is preventing the module from being reused? Could there be anything 
else that might prevent AWS from reusing the module?

El lunes, 29 de febrero de 2016, 21:27:28 (UTC-3), Rich Jones escribió:
>
> As I suspected, moving setup() outside of the handler had a negligible 
> effect - in fact the test showed a slight drop in performance. :(
>
> Testing from httping. From Berlin to US-East-1:
>
> Before:
> --- 
> https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ 
> ping statistics ---
> 52 connects, 52 ok, 0.00% failed, time 56636ms
> round-trip min/avg/max = 59.1/104.8/301.9 ms
>
> After:
> --- 
> https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ 
> ping statistics ---
> 51 connects, 51 ok, 0.00% failed, time 57306ms
> round-trip min/avg/max = 61.8/128.7/523.2 ms
>
> It was a nice thought though!
>



Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
I have repeated the test, this time through API Gateway, and out of many 
calls I only got two different dates instantiated at module level, meaning 
my module was only imported twice. I fail to see why it doesn't behave the 
same with your code.

El lunes, 29 de febrero de 2016, 21:33:05 (UTC-3), Cristiano Coelho 
escribió:
>
> That's quite odd, I recall testing this once, where I created a lambda 
> which had a datetime.now() at the top, and just returned that value. Out of 
> a few calls, it returned two different results, meaning the module was re 
> used "most" of the time. This was tested calling the lambda from the AWS 
> Test itself and not through API Gateway, so perhaps API Gateway is 
> preventing the module from being re used? Could be there anything else that 
> might prevent AWS from re using the module?
>
> El lunes, 29 de febrero de 2016, 21:27:28 (UTC-3), Rich Jones escribió:
>>
>> As I suspected, moving setup() outside of the handler had a negligible 
>> effect - in fact the test showed a slight drop in performance. :(
>>
>> Testing from httping. From Berlin to US-East-1:
>>
>> Before:
>> --- 
>> https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ 
>> ping statistics ---
>> 52 connects, 52 ok, 0.00% failed, time 56636ms
>> round-trip min/avg/max = 59.1/104.8/301.9 ms
>>
>> After:
>> --- 
>> https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ 
>> ping statistics ---
>> 51 connects, 51 ok, 0.00% failed, time 57306ms
>> round-trip min/avg/max = 61.8/128.7/523.2 ms
>>
>> It was a nice thought though!
>>
>



Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
Rich, I have just performed a real test, with a simple lambda and a 
datetime.now() defined at the top of the module as I said, and out of 100 
requests this was the result:
{u'2016-03-01T00:37:30.476828': [43], u'2016-03-01T00:36:51.536025': [58]}
The date is the datetime.now() defined at the top of the module, and the 
number is the amount of times that same value was returned (oddly it sums 
to 101; I think I did an additional request). So the module really is being 
reused, even with API Gateway (I did the test from Python against the 
remote address). So in your case there must be something else going on, or 
I did this test wrong.
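[Editor's note] The spirit of the test above is easy to reproduce locally: a value computed at module import time stays constant across handler calls within one process, so the number of distinct values observed equals the number of cold starts (here, exactly one, since everything runs in a single process).

```python
import collections
import datetime

# Computed once, at import time -- the per-container "boot timestamp".
BOOT_TIME = datetime.datetime.now().isoformat()

def handler(event=None, context=None):
    # Returns the module-level value; no recomputation per call.
    return BOOT_TIME

# 100 "requests" against the same process:
counts = collections.Counter(handler() for _ in range(100))
print(len(counts))  # 1 -- a single cold start, so a single timestamp
```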


El lunes, 29 de febrero de 2016, 21:44:43 (UTC-3), Rich Jones escribió:
>
> That certainly could have something to do with it - there isn't very much 
> transparency about how API Gateway works. It's super new and pretty janky, 
> TBQH. However, I think that the behavior you're describing is not what's expected 
> - the caching seems to be for the assets of the whole environment, not of 
> anything that's computed - whether or not they are held in memory or read 
> from disk.
>
> [[ Also, obviously it's not a fair comparison, but I thought I'd include 
> these numbers for reference:
> --- http://djangoproject.com/ ping statistics ---
> 52 connects, 52 ok, 0.00% failed, time 68473ms
> round-trip min/avg/max = 227.9/329.3/1909.3 ms ]]
>
> So, I think the interesting things to explore would be:
>  - Loading apps in parallel
>  - "Pre-loading" apps before app deployment, then loading that cached 
> state at runtime. I guess I'd need to know more about what it means to 
> "load" an app to see if that makes any sense at all.
>
> I imagine the former is probably more feasible. I understand the aversion 
> to non-determinism, but I think that shouldn't matter as long as there is a 
> way to define inter-app dependencies.
>



Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
I think I have found your issue: if I'm not wrong, Django won't initialize 
twice if you call setup() twice (which was your previous behaviour), at 
least not completely, since the apps registry has a "ready" flag.
Check this line: 
https://github.com/django/django/blob/master/django/apps/registry.py#L66

So basically, having setup() at the top of the module or inside the handler 
should yield almost the same results, since Django won't reload apps that 
are already loaded, and that's why the change didn't yield any different 
results. Does the slowdown come from somewhere else rather than setup()?
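[Editor's note] The guard referenced above behaves roughly like this toy registry (a sketch of the short-circuit pattern, not Django's actual apps.registry code):

```python
class ToyRegistry:
    """Mimics the short-circuit behaviour of a ready-flagged registry."""
    def __init__(self):
        self.ready = False
        self.load_count = 0

    def populate(self):
        if self.ready:
            return            # repeated call is a cheap no-op
        self.load_count += 1  # stands in for importing all the app configs
        self.ready = True

reg = ToyRegistry()
reg.populate()  # first call does the expensive work
reg.populate()  # second call short-circuits on the ready flag
print(reg.load_count)  # 1
```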

From here, doing 100 requests, these are the results; times are quite high 
since I have at least 200 ms round trip from here to the USA.

min: 0.34s
max: 3.58s
avg: 0.57s

El lunes, 29 de febrero de 2016, 22:21:11 (UTC-3), Rich Jones escribió:
>
> Hm. This is the downside of cloud services - we cannot look under the hood.
>
> Since I think that since this is something we _want_ cached, and because 
> it will make the function being executed shorter in length - it is a good 
> code decision to make. Thank you for the idea! However, it looks like the 
> actual result is negligible in terms of round-trip time.
>
> Maybe you can try with httping and show me your times? Or give 
> django-zappa a shot yourself! This is the app I'm testing with: 
> https://github.com/Miserlou/django-helloworld
>



Re: Improving django.setup() and app loading performance

2016-02-29 Thread Cristiano Coelho
In my opinion, those latencies are still a bit high; how much time is really 
spent in Python/Lambda code? On a project of mine, without hitting the 
database and using django-rest-framework, my times were around 1-4 ms 
excluding any network latency. Debug mode and loggers might have a high 
impact on your times if it is using CloudWatch.

An interesting way of testing concurrency is to use ApacheBench. You 
can run it easily from an EC2 machine, ideally in the same region as your 
Lambdas to reduce network latency to near zero. There you can test a nearly 
real scenario, as this tool is quite good for load testing. Be careful not 
to run a huge test, as you might start to get charged :D
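For example (hypothetical endpoint URL; the -n and -c values are just a 
starting point, not a recommendation):

```shell
# 100 requests total (-n), 10 concurrent (-c), against the deployed
# API Gateway URL. Keep -n small at first so Lambda invocations and
# data transfer stay within the free tier.
ab -n 100 -c 10 https://example.execute-api.us-east-1.amazonaws.com/prod/
```

ab reports requests per second and min/mean/max latency percentiles, which 
maps directly onto the numbers being compared in this thread.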

I have opened a GitHub issue on Zappa after reading your handler.py code, 
with a few suggestions that may help you improve performance even further. 
Sorry that I'm too lazy to make a real pull request.

On Monday, February 29, 2016 at 22:47:43 (UTC-3), Rich Jones wrote:
>
> Ah, interesting! Thanks for tracking that down.
>
> In chat we basically discovered that the intercontinental latency and 
> shoddy wifi connections were responsible for a lot of the confusion.
>
> Testing from a US-based fiber connection, we got results of ~40ms in both 
> scenarios.
>
>  --- 
> https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ 
> ping statistics ---
> 16 connects, 16 ok, 0.00% failed, time 15989ms
> round-trip min/avg/max = 32.7/42.3/142.4 ms
>
> So, there you have it! As if you needed more of a reason to get down with 
> Zappa?! :-D
>
> (Now, that's with a naive-example. Expect this thread to get bumped when 
> we start really hitting the databases..)
>



[Question] jsi18n - get_javascript_catalog a bit obscure?

2016-03-01 Thread Cristiano Coelho
https://github.com/django/django/blob/master/django/views/i18n.py#L204

Can someone explain to me why it always has to load English as the first 
fallback?

Also, line 248:

 # If the currently selected language is English but it doesn't have a
 # translation catalog (presumably due to being the language translated
 # from) then a wrong language catalog might have been loaded in the
 # previous step. It needs to be discarded.

Looks unnecessarily hard.

On a project, I had to monkey patch this method because I didn't add 
translations at all for my default language (not English), since the 
identifiers were enough, and all the server-side translations worked 
fine. However, for JavaScript translations, since English was one of my 
translations, I would end up getting English strings when I was 
expecting the default identifier to be returned. To fix it I just 
removed all the English and default-language loading, and everything worked 
as expected. What would be the issues with this? Was there actually another 
approach?

Below is my slightly changed version of this method (I did it quite a while 
ago but it surfaced again for some reason). I would like to know if it makes 
sense, and whether the original code really has some obscure logic or I'm 
missing something important about it.

def get_javascript_catalog(locale, domain, packages):
    # imports used (as in django/views/i18n.py): importlib, os,
    # gettext as gettext_module, django.apps.apps, django.conf.settings,
    # django.utils.six, django.utils._os.upath,
    # django.utils.translation.to_locale
    default_locale = to_locale(settings.LANGUAGE_CODE)  # unused here; kept from the original signature
    app_configs = apps.get_app_configs()
    allowable_packages = set(app_config.name for app_config in app_configs)
    allowable_packages.add('django.conf')
    packages = [p for p in packages if p in allowable_packages]
    t = {}
    paths = []

    # paths of requested packages
    for package in packages:
        p = importlib.import_module(package)
        path = os.path.join(os.path.dirname(upath(p.__file__)), 'locale')
        paths.append(path)
    # add the filesystem paths listed in the LOCALE_PATHS setting
    paths.extend(list(reversed(settings.LOCALE_PATHS)))

    locale_t = {}
    for path in paths:
        try:
            catalog = gettext_module.translation(domain, path, [locale])
        except IOError:
            catalog = None
        if catalog is not None:
            locale_t.update(catalog._catalog)

    if locale_t:
        t = locale_t
    plural = None
    if '' in t:
        for l in t[''].split('\n'):
            if l.startswith('Plural-Forms:'):
                plural = l.split(':', 1)[1].strip()
    if plural is not None:
        # this should actually be a compiled function of a typical plural-form:
        # Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
        plural = [el.strip() for el in plural.split(';')
                  if el.strip().startswith('plural=')][0].split('=', 1)[1]

    pdict = {}
    maxcnts = {}
    catalog = {}
    for k, v in t.items():
        if k == '':
            continue
        if isinstance(k, six.string_types):
            catalog[k] = v
        elif isinstance(k, tuple):
            msgid = k[0]
            cnt = k[1]
            maxcnts[msgid] = max(cnt, maxcnts.get(msgid, 0))
            pdict.setdefault(msgid, {})[cnt] = v
        else:
            raise TypeError(k)
    for k, v in pdict.items():
        # was maxcnts[msgid], which read a variable leaked from the previous
        # loop; use the current msgid key instead
        catalog[k] = [v.get(i, '') for i in range(maxcnts[k] + 1)]

    return catalog, plural








Re: [Question] jsi18n - get_javascript_catalog a bit obscure?

2016-03-01 Thread Cristiano Coelho
Maybe I wasn't clear either, but the main issue is this: when using a 
language equal to the default one, and that language does not define 
any translation text (because the ids are the same as the values, so it is 
not necessary), the server-side translations will always correctly return 
the translated texts, while the JavaScript ones won't, because they always 
have an English fallback.
Also, the code I provided should actually load the default language 
translations as well (but not the English ones!) to make it behave exactly 
like the server-side translations.



Re: [Question] jsi18n - get_javascript_catalog a bit obscure?

2016-03-01 Thread Cristiano Coelho
Looking through the git history, it seems the "always load English 
translations" code is quite a few years old.

There's a 5-year-old ticket here: https://code.djangoproject.com/ticket/16284

Which leads to here: https://code.djangoproject.com/ticket/3594, with a fix 
that adds the "discard if English not found" behaviour, which doesn't resolve 
the issue completely if you actually have English translations but English 
is not the default language.

There's also a link to here: 
https://groups.google.com/forum/#!topic/django-developers/1X_tPbhG_NQ 
proposing a change, with no replies at all.

I couldn't really understand why the hardcoded English language 
can not be removed from here.

In my opinion, the code should be slightly changed to load only the default 
language rather than English as the fallback (and only as long as the 
requested language is not the same as the default one), so it can match the 
actual server-side behaviour, which loads the language configured in 
settings as a fallback as long as it is different from the requested one 
(and from English, which makes sense because Django's text ids are all in 
English).
This change could however affect people who rely on always having an 
English fallback when the configured default language is not English (does 
this even make sense? Would anyone do that?)

After this change is done, there could be an improvement (for both JS and 
server-side translations) that some people might find useful: rather than 
always falling back to the default configured language, you could have a 
map of fallbacks. For example, if the user requests Portuguese but you only 
have English (default) and Spanish (secondary), it makes more sense to fall 
back to Spanish than to English; but if the user requests Russian, it makes 
more sense to fall back to English.
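A rough sketch of such a fallback map (FALLBACK_LANGUAGES and AVAILABLE are 
invented names for illustration, not existing Django settings):

```python
# Hypothetical per-language fallback map: each entry lists the languages
# to try, in order, before the hard default.
FALLBACK_LANGUAGES = {
    "pt": ["es", "en"],  # Portuguese prefers Spanish over English
    "ru": ["en"],
}
AVAILABLE = {"en", "es"}  # catalogs the project actually ships

def choose_catalog_chain(requested, default="en"):
    """Return the ordered list of catalogs to try for `requested`."""
    chain = [requested] + FALLBACK_LANGUAGES.get(requested, []) + [default]
    seen, out = set(), []
    for lang in chain:
        if lang in AVAILABLE and lang not in seen:
            seen.add(lang)
            out.append(lang)
    return out

print(choose_catalog_chain("pt"))  # → ['es', 'en']
print(choose_catalog_chain("ru"))  # → ['en']
```

The lookup code would then merge catalogs in reverse order of this chain, 
so the most-preferred language wins for each msgid.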

On Tuesday, March 1, 2016 at 21:29:42 (UTC-3), Tim Graham wrote:
>
> Have you tried looking through history with git blame to find related 
> tickets? Another tip is to search Google with a query like 
> "javascript_catalog site:code.djangoproject.com". This will let you find 
> tickets to see if the issue was raised before. This is how I try to answer 
> questions like this since many of the original authors are no longer active 
> (or at least, you can find the person who authored the code in question and 
> try a more directed query like a ping in IRC).
>
> On Tuesday, March 1, 2016 at 6:23:38 PM UTC-5, Cristiano Coelho wrote:
>>
>> Maybe I wasn't clear neither, but the main issue is this: when using a 
>> language equals to the default one, and if that language does not define 
>> any translation text (because ids are the same as values so it is not 
>> necessary), the server side translations will always correctly return the 
>> translated texts, while the javascript won't because it always has an 
>> english fallback.
>> Also the code I provided should actually load the default language 
>> translations as well (but not the english ones!) to make it behave exactly 
>> as the server side translations.
>>
>



Re: Django admin and messages

2016-03-01 Thread Cristiano Coelho
Looking at it more deeply, it seems mostly like a translation issue for the 
Spanish (and maybe other) languages, since in some cases both gender 
articles are added ("el/la") to make the text generic, but for the specific 
case I pointed out above it is missing.


msgid ""
"The %(name)s \"%(obj)s\" was added successfully. You may edit it again 
below."
msgstr ""
"Se añadió con éxito el %(name)s \"%(obj)s\". Puede editarlo de nuevo abajo."

Should be changed to

msgid ""
"The %(name)s \"%(obj)s\" was added successfully. You may edit it again 
below."
msgstr ""
"Se añadió con éxito el/la %(name)s \"%(obj)s\". Puede editarlo/a de nuevo 
abajo."

The same happens for some other translations below this one. I couldn't 
find out why it is different for some and not for others, but "el/la" is 
definitely better than "el" followed by a feminine object name.
Another approach for gendered languages like Spanish would be to use "el 
objeto %(obj)s" rather than "el/la %(obj)s".
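Until the upstream catalog changes, a project can override the message 
locally. Here is a sketch of a project-level catalog using the 
gender-neutral wording (hypothetical path; it assumes 'locale/' is listed 
in settings.LOCALE_PATHS, whose catalogs take precedence over Django's own):

```po
# locale/es/LC_MESSAGES/django.po -- hypothetical project-level override
msgid ""
"The %(name)s \"%(obj)s\" was added successfully. You may edit it again "
"below."
msgstr ""
"Se añadió con éxito el objeto %(name)s \"%(obj)s\". Puede editarlo de "
"nuevo abajo."
```

After editing, run django-admin compilemessages so the compiled .mo file is 
regenerated and the override is actually served.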



On Saturday, February 20, 2016 at 16:51:27 (UTC-3), Cristiano Coelho wrote:
>
> Hello, 
>
> It seems that all admin "sucess" (and others) messages are hardcoded 
> (almost, actually translations) into the methods that use them and can not 
> be easily changed (like 'The %(name)s "%(obj)s" was added successfully. You 
> may add another %(name)s below.').
> This is causing some issues on languages with higher gender usage, like 
> spanish, that you can get into texts like 'El Persona "Id: 123" .' 
> (should be La Persona), so it would be convenient to be able to easily 
> change these messages so you could just change it to "Object added 
> successfully" or something to prevent the above issue.
>
> Looking through the file django/contrib/admin/options.py you can see the 
> huge usage of texts defined deep into the methods, which makes it 
> impossible to override any of them.
>



Re: Django admin and messages

2016-03-01 Thread Cristiano Coelho
Actual file with the issue: 
https://github.com/django/django/blob/master/django/contrib/admin/locale/es/LC_MESSAGES/django.po#L168

On Tuesday, March 1, 2016 at 22:42:05 (UTC-3), Cristiano Coelho wrote:
>
> Looking it deeper it seems mostly like a translation issue for the spanish 
> (and maybe other) languages, since in some cases both gender articles are 
> added ( "el/la" ) to make it generic but for the specific case I pointed 
> above it is missing.
>
>
> msgid ""
> "The %(name)s \"%(obj)s\" was added successfully. You may edit it again 
> below."
> msgstr ""
> "Se añadió con éxito el %(name)s \"%(obj)s. Puede editarlo de nuevo abajo."
>
> Should be changed to
>
> msgid ""
> "The %(name)s \"%(obj)s\" was added successfully. You may edit it again 
> below."
> msgstr ""
> "Se añadió con éxito el/la %(name)s \"%(obj)s. Puede editarlo/a de nuevo 
> abajo."
>
> The same happens for some other translations below this one. I couldn't 
> find why is it different for some and not for others, but "el/la" is 
> definitely better than "el" and having a female object following it.
> Another approach for gender languages like spanish would be to use "el 
> objeto %(obj)" rather than "el/la %(obj)".
>
>
>
> On Saturday, February 20, 2016 at 16:51:27 (UTC-3), Cristiano Coelho wrote:
>>
>> Hello, 
>>
>> It seems that all admin "sucess" (and others) messages are hardcoded 
>> (almost, actually translations) into the methods that use them and can not 
>> be easily changed (like 'The %(name)s "%(obj)s" was added successfully. You 
>> may add another %(name)s below.').
>> This is causing some issues on languages with higher gender usage, like 
>> spanish, that you can get into texts like 'El Persona "Id: 123" .' 
>> (should be La Persona), so it would be convenient to be able to easily 
>> change these messages so you could just change it to "Object added 
>> successfully" or something to prevent the above issue.
>>
>> Looking through the file django/contrib/admin/options.py you can see the 
>> huge usage of texts defined deep into the methods, which makes it 
>> impossible to override any of them.
>>
>



Re: Django admin and messages

2016-03-03 Thread Cristiano Coelho
By "we are doing", do you mean that's how it is translated by Django, or are 
you patching translations in your projects? Any idea why it would be 
different for Spanish? It really burns my eyes seeing those grammar 
mistakes :D

On Wednesday, March 2, 2016 at 6:45:24 (UTC-3), Claude Paroz wrote:
>
> On Wednesday, March 2, 2016 at 02:42:05 UTC+1, Cristiano Coelho wrote:
>>
>> Another approach for gender languages like spanish would be to use "el 
>> objeto %(obj)" rather than "el/la %(obj)".
>>
>
> That's exactly what we are doing for French.
>
> Claude 
>



Re: Django admin and messages

2016-03-03 Thread Cristiano Coelho
The issue is that we do not force any particular es variant; it is pretty 
much up to the browser, so it can be es, es_AR, es_UY, etc. And it is not 
really an issue with this particular Spanish project (no success messages 
are shown there) but a translation issue with the Spanish translations in 
general.
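For reference, Django already collapses regional variants: 
get_supported_language_variant() falls back from a dialect such as "es-ar" 
to the generic "es" when the dialect is not offered. A minimal settings 
sketch (assumed project values, not from this thread):

```python
# settings.py fragment (hypothetical project): browsers sending
# es, es-AR or es-UY all end up on the generic "es" catalog because
# only "es" is offered in LANGUAGES.
LANGUAGE_CODE = "es"
LANGUAGES = [
    ("es", "Español"),
]
```

So fixing the generic es catalog upstream would cover all of these dialect 
requests at once.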

On Thursday, March 3, 2016 at 23:31:19 (UTC-3), Ramiro Morales wrote:
>
> On Thu, Mar 3, 2016 at 11:06 PM, Cristiano Coelho <cristia...@gmail.com 
> > wrote:
>
>> By "we are doing" do you mean that's how it is translated by django, or 
>> are you patching translations on your projects? Any ideas why would it be 
>> different for spanish? It really burns my eyes seeing those grammar 
>> mistakes :D
>>
>
> You could try the es_AR translation. I think we have all these cases covered.
>
> -- 
> Ramiro Morales
> @ramiromorales
>



Re: [Question] jsi18n - get_javascript_catalog a bit obscure?

2016-03-04 Thread Cristiano Coelho
Would a pull request be accepted? Does it need to be a new branch even if 
the change is just a few lines? Does it need an open ticket first?

On Tuesday, March 1, 2016 at 22:26:00 (UTC-3), Cristiano Coelho wrote:
>
> Looking through git history seems like the "always load english 
> translations" code is quite a few years old.
>
> There's a 5 y.o ticket in here: 
> https://code.djangoproject.com/ticket/16284
>
> Which leads to here: https://code.djangoproject.com/ticket/3594 with a 
> fix that adds the "discard if english not found" which doesn't resolve the 
> issue completely if you actually have english translations but it is not 
> the default language.
>
> There's also a link to here: 
> https://groups.google.com/forum/#!topic/django-developers/1X_tPbhG_NQ 
> proposing a change with no replies at all.
>
> I couldn't really understand why is it that the hardcoded english language 
> can not be removed from here.
>
> In my opinion the code should be slightly changed to only load the default 
> language rather than english (and as long as the requested language is not 
> the same as the default one) as fallback, so it can match the actual server 
> side behaviour, which will load the configured language (at settings) as a 
> fallback language as long as it is different from it (an from english, 
> which makes sense because django text ids are all in english).
> This change could however affect people that relies on always having an 
> english translation as fallback when the configured default language is not 
> english (does this even make sense? Would anyone do that?)
>
> After this change is done, there could be an improvement (for both js and 
> server side translations) that some people might find useful, rather than 
> always falling back to the default configured language, you could have a 
> map of fallbacks, for example, if the user requests a portuguese language, 
> but you only have english (default) and spanish (secondary), it makes more 
> sense to fallback to spanish rather than english, but if the user requests 
> russian, it makes more sense to fallback to english.
>
> On Tuesday, March 1, 2016 at 21:29:42 (UTC-3), Tim Graham wrote:
>>
>> Have you tried looking through history with git blame to find related 
>> tickets? Another tip is to search Google with a query like 
>> "javascript_catalog site:code.djangoproject.com". This will let you find 
>> tickets to see if the issue was raised before. This is how I try to answer 
>> questions like this since many of the original authors are no longer active 
>> (or at least, you can find the person who authored the code in question and 
>> try a more directed query like a ping in IRC).
>>
>> On Tuesday, March 1, 2016 at 6:23:38 PM UTC-5, Cristiano Coelho wrote:
>>>
>>> Maybe I wasn't clear neither, but the main issue is this: when using a 
>>> language equals to the default one, and if that language does not define 
>>> any translation text (because ids are the same as values so it is not 
>>> necessary), the server side translations will always correctly return the 
>>> translated texts, while the javascript won't because it always has an 
>>> english fallback.
>>> Also the code I provided should actually load the default language 
>>> translations as well (but not the english ones!) to make it behave exactly 
>>> as the server side translations.
>>>
>>



Re: [Question] jsi18n - get_javascript_catalog a bit obscure?

2016-03-04 Thread Cristiano Coelho
Where do I get an account for Trac tickets? :) I guess I should still open 
a ticket first

On Friday, March 4, 2016 at 18:31:41 (UTC-3), Tim Graham wrote:
>
> Sure, I mean... my impression as indicated by the lack of response to the 
> thread is that no one provides much expertise in javascript_catalog() these 
> days (although it's only been a couple days). I don't use it myself so I do 
> my best to review proposed changes and then we wait for bug reports. Maybe 
> this will be the start of your journey to being that expert.
>
> Quoting https://github.com/django/django/blob/master/CONTRIBUTING.rst, 
> "non-trivial pull requests (anything more than fixing a typo) without Trac 
> tickets will be closed! Please file a ticket 
> <https://code.djangoproject.com/newticket> to suggest changes."
>
> On Friday, March 4, 2016 at 4:11:08 PM UTC-5, Cristiano Coelho wrote:
>>
>> Would a pull request be accepted? Does it need to be a new branch even if 
>> the change is just a few lines? Does it need an open ticket first?
>>
>> On Tuesday, March 1, 2016 at 22:26:00 (UTC-3), Cristiano Coelho wrote:
>>>
>>> Looking through git history seems like the "always load english 
>>> translations" code is quite a few years old.
>>>
>>> There's a 5 y.o ticket in here: 
>>> https://code.djangoproject.com/ticket/16284
>>>
>>> Which leads to here: https://code.djangoproject.com/ticket/3594 with a 
>>> fix that adds the "discard if english not found" which doesn't resolve the 
>>> issue completely if you actually have english translations but it is not 
>>> the default language.
>>>
>>> There's also a link to here: 
>>> https://groups.google.com/forum/#!topic/django-developers/1X_tPbhG_NQ 
>>> proposing a change with no replies at all.
>>>
>>> I couldn't really understand why is it that the hardcoded english 
>>> language can not be removed from here.
>>>
>>> In my opinion the code should be slightly changed to only load the 
>>> default language rather than english (and as long as the requested language 
>>> is not the same as the default one) as fallback, so it can match the actual 
>>> server side behaviour, which will load the configured language (at 
>>> settings) as a fallback language as long as it is different from it (an 
>>> from english, which makes sense because django text ids are all in english).
>>> This change could however affect people that relies on always having an 
>>> english translation as fallback when the configured default language is not 
>>> english (does this even make sense? Would anyone do that?)
>>>
>>> After this change is done, there could be an improvement (for both js 
>>> and server side translations) that some people might find useful, rather 
>>> than always falling back to the default configured language, you could have 
>>> a map of fallbacks, for example, if the user requests a portuguese 
>>> language, but you only have english (default) and spanish (secondary), it 
>>> makes more sense to fallback to spanish rather than english, but if the 
>>> user requests russian, it makes more sense to fallback to english.
>>>
>>> On Tuesday, March 1, 2016 at 21:29:42 (UTC-3), Tim Graham wrote:
>>>>
>>>> Have you tried looking through history with git blame to find related 
>>>> tickets? Another tip is to search Google with a query like 
>>>> "javascript_catalog site:code.djangoproject.com". This will let you 
>>>> find tickets to see if the issue was raised before. This is how I try to 
>>>> answer questions like this since many of the original authors are no 
>>>> longer 
>>>> active (or at least, you can find the person who authored the code in 
>>>> question and try a more directed query like a ping in IRC).
>>>>
>>>> On Tuesday, March 1, 2016 at 6:23:38 PM UTC-5, Cristiano Coelho wrote:
>>>>>
>>>>> Maybe I wasn't clear neither, but the main issue is this: when using a 
>>>>> language equals to the default one, and if that language does not define 
>>>>> any translation text (because ids are the same as values so it is not 
>>>>> necessary), the server side translations will always correctly return the 
>>>>> translated texts, while the javascript won't because it always has an 
>>>>> english fallback.
>>>>> Also the code I provided should actually load the default language 

Re: [ GSoC2016 ] Integration of django and angular2

2016-03-09 Thread Cristiano Coelho
In my opinion angular + django + django-rest-framework is a very powerful 
combo, and using Django templates mixed with Angular (for anything other 
than the index page) is really a bad idea, since templates are quite slow 
and rendered by Python. Angular should run against a 100% web API, with its 
own template framework running everything client side. If I'm not wrong, 
django-angular uses this Angular + Django templates approach; would the 
Angular 2 one be any different?

On Wednesday, March 9, 2016 at 17:33:49 (UTC-3), Вадим Горбачев wrote:
>
> Hello.
>
> I am a student of the St. Petersburg State Polytechnical University.
> But I am also a web-developer 
>  .
>
> Now the problem of integrating django and angular2 is very interesting 
> to me, and I want to submit an application in this direction.
>
> *Reason:*
>  There are many solutions based on django + angularJS, 
>  and it would be good to aggregate this knowledge, which would help many 
> developers avoid stepping on the same rake.
>  A very big contribution to the solution of this problem can be found here 
> . 
>
>  But with the release of ECMAScript 6 and Angular 2, there is a wish to 
> move to them somewhat quicker.
>  ECMAScript 6 is awesome!
>
> *Question:*
>  Would this task be interesting to the community?
>  Or should I look for a subject closer to the offered list?
>
> thanks in advance!
>



Re: Improving MSSQL and Azure SQL support on Django

2016-03-09 Thread Cristiano Coelho
"Improve documentation/examples [decrease confusion]: There's already so 
much awesome content out there on getting started with Django (but not many 
are referencing MSSQL as the db of choice or why MSSQL is a great option)."

I wouldn't think of MSSQL as a great option for Django, at least until it is 
supported natively rather than through third-party apps, which are always 
behind Django updates.

On Tuesday, March 8, 2016 at 23:20:58 (UTC-3), Vin Yu wrote:
>
> Hey Tim,
>
> We've gotten lots of questions about the tools when we announced SQL 
> Server on Linux. I am curious; what are the DB management/development tasks 
> that are being performed by your coworkers? What are they using SSMS for? I 
> am interested in learning more. [Perhaps we can follow up by email as this 
> seems off-topic here :) ] 
>
> In terms of strengthening the story for MSSQL-Django, I think there is a 
> little bit of both difficulty and confusion over options; here are some 
> ideas that we are working on and could solve these issues:
>
>- Improve documentation/examples [decrease confusion]: There's already 
>so much awesome content out there on getting started with Django (but not 
>many are referencing MSSQL as the db of choice or why MSSQL is a great 
>option).
>- Improve getting started experience [decrease difficulty]: Getting 
>MSSQL for development (free and easy/fast set up) is hard today; this is on 
>MSFT to improve this experience.
>
> We want to help provide better developer experiences for those who want to 
> create new Django apps + MSSQL databases and if MSSQL were in the core, it 
> would definitely help with that. This would increase usage and is something 
> we are striving to achieve. We will continue to work with the community to 
> make this happen.  
>
> =) , 
> Vin
>
>
> On Tuesday, 8 March 2016 10:13:34 UTC-8, Tim Allen wrote:
>>
>> [slightly off-topic] I'm wondering if this will extend to SQL Server 
>> Management Studio. While I'm mainly a command line basher, many of 
>> coworkers are married to the GUI. I've found SSMS blows the competition out 
>> of the water when it comes to DB management GUIs. I'm wondering if this 
>> means SSMS will run on Linux (or Mac) eventually.
>>
>> This is certainly very big news. I wouldn't be shocked to some day see 
>> Windows itself running on the Linux Kernel.
>>
>> Meet, how can we help strengthen the story for MSSQL-Django? It seems we 
>> have a chicken and egg problem here. A very small amount of Django sites 
>> use SQL Server, but is that because of the difficulty in the available 
>> stack and confusion over options? Would usage increase if provided in core?
>>
>> On Monday, March 7, 2016 at 6:03:29 PM UTC-5, Josh Smeaton wrote:
>>>
>>> Wow, that's really great news! I haven't used mssql for a number of 
>>> years but it was always very nice to work with. Having it available to run 
>>> on linux will make it much easier for the Django community to test against 
>>> mssql, provided we're able to get/develop an appropriate driver and 
>>> backend. 
>>>
>>> Cheers
>>>
>>> On Tuesday, 8 March 2016 09:37:06 UTC+11, Meet Bhagdev wrote:

 Hi all,

 On interacting with several Django developers and committers, one of 
 the questions often came up, can I use SQL Server on non Window OS's? I 
 wanted to share that today Microsoft announced SQL Server availibility on 
 Linux - 
 https://blogs.microsoft.com/blog/2016/03/07/announcing-sql-server-on-linux/
 . 

 While there is still work needed to strengthen the MSSQL-Django story, 
 we hope this aids more Linux developers to give SQL Server a shot. Let me 
 know of your thoughts and questions :)

 Cheers,
 Meet

 On Monday, February 22, 2016 at 4:54:38 PM UTC-8, Vin Yu wrote:
>
> Hey Folks, 
>
> My name is Vin and I work with Meet in the Microsoft SQL Server team. 
> Just wanted to let you all know we are still looking into how we can 
> better 
> improve and support MSSQL for the Django framework. We’ll continue to 
> sync 
> with Michael and let you know of any updates soon. 
>
> Christiano and Tim - thanks for sharing your interest and sharing how 
> you are using Django with MSSQL. It's great to learn from your scenarios. 
>
> If you have any concerns, questions or comments feel free to reach out 
> to me at vinsonyu[at]microsoft.com



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 

Re: MySQL data loss possibility with concurrent ManyToManyField saves

2016-03-14 Thread Cristiano Coelho
The django-admin interface is quite bad at handling concurrent 
modifications. This particular problem might not happen on other backends 
and is quite critical, but there are other issues that, while not as 
critical as data loss, can still cause unexpected errors. For example: two 
people view a model with some inline relations; one of them deletes one of 
the inline rows, and the other then performs a save. The save tries to 
re-save the row that was actually deleted, and you end up with a "please 
correct the errors below" message but see no errors (because the row you 
tried to save no longer exists and is no longer loaded). Similar errors 
happen with other kinds of concurrent modification.

As for the MySQL issue, using a higher isolation level might fix it, but I 
would strongly recommend not changing it, as this is mainly an issue with 
how Django performs the many-to-many save and not MySQL's fault at all. If 
you still go down this path, it would be a good idea to have the isolation 
level change not set globally (in settings) but passed as an argument to 
the transaction.atomic() function, so it can be changed per transaction.

Would it work if, rather than performing a DELETE, SELECT, INSERT combo, you 
performed a DELETE and INSERT IGNORE? I believe this is supported by MySQL, 
and PostgreSQL has similar functionality starting with 9.5, which can 
otherwise be emulated through locks or nested queries.
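To illustrate the idea, here is a sketch using the stdlib sqlite3 module, where INSERT OR IGNORE plays the role of MySQL's INSERT IGNORE (or PostgreSQL 9.5's ON CONFLICT DO NOTHING). The book_authors table and its columns are made up for the example; this is not Django's actual many-to-many code:

```python
import sqlite3

# Sketch of the DELETE + INSERT IGNORE combo: no intermediate SELECT,
# and re-inserting an existing link does not raise an IntegrityError.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book_authors ("
    "book_id INTEGER, author_id INTEGER, "
    "PRIMARY KEY (book_id, author_id))"
)
conn.execute("INSERT INTO book_authors VALUES (1, 10)")

# Set book 1's authors to {10, 20}: remove links that should no longer
# exist, then insert the desired set, silently skipping duplicates.
conn.execute(
    "DELETE FROM book_authors "
    "WHERE book_id = 1 AND author_id NOT IN (10, 20)"
)
conn.executemany(
    "INSERT OR IGNORE INTO book_authors VALUES (?, ?)",
    [(1, 10), (1, 20)],
)

rows = sorted(conn.execute("SELECT book_id, author_id FROM book_authors"))
print(rows)  # [(1, 10), (1, 20)]
```

The second insert of (1, 10) is simply ignored instead of failing, which is the property that would make the concurrent re-save harmless.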

What if there were a way to store the last modified date of an object 
(check its history, maybe?), save that date locally in the admin form, and, 
when performing the actual save, compare the two and raise an error if they 
do not match? That would prevent most concurrent modification issues 
(although it makes things a bit more annoying for edits that usually have 
no conflicts), but you might still be prone to a small race condition on the 
modification date, unless you perform a select for update on it first.
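The last-modified comparison can be sketched as a single conditional UPDATE, which also avoids the race on the modification date. An integer version column stands in for the date here, and the item table and save() helper are invented for the example (not Django API):

```python
import sqlite3

# Optimistic-locking sketch: the UPDATE only matches while the row still
# carries the version the form was rendered with, so rowcount == 0 means
# someone else saved in between.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO item VALUES (1, 'original', 1)")

def save(conn, item_id, new_name, seen_version):
    cur = conn.execute(
        "UPDATE item SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, item_id, seen_version),
    )
    return cur.rowcount == 1  # False -> concurrent modification detected

first = save(conn, 1, "mine", 1)     # first editor succeeds
second = save(conn, 1, "theirs", 1)  # stale editor is rejected
name, version = conn.execute(
    "SELECT name, version FROM item WHERE id = 1").fetchone()
print(first, second, name, version)  # True False mine 2
```

Because the check and the write happen in one statement, no separate select for update is needed for this particular race.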



On Monday, March 14, 2016 at 12:15:31 (UTC-3), Tim Graham wrote:
>
> Could some MySQL users take a look at ticket #26347 [0] and recommend how 
> to proceed? I think it's probably not a new issue but I'm a bit surprised 
> it hasn't come up before if so.
>
> [0] https://code.djangoproject.com/ticket/26347
>



Re: MySQL data loss possibility with concurrent ManyToManyField saves

2016-03-14 Thread Cristiano Coelho
I can't tell whether it is a bug in MySQL or not, but I understood it the 
same way you did (the first example with Session A and Session B makes it 
clearer), so I can't quite understand how the poster got that issue. I would 
like an example similar to the one in the docs but with a delete in the 
middle; it is a bit unclear what happens. Could a delete in the middle 
create a new "snapshot" for your next select?


So assuming the issue is real and not a bug in MySQL:

- Would a DELETE + INSERT IGNORE fix the issue, while also improving 
many-to-many updates on backends that support INSERT IGNORE? (one less 
select)

- Would a different isolation level help? What about SERIALIZABLE? The only 
good way to do this is to have the option to change the isolation level on 
each transaction, improving the transaction.atomic() method so that we do 
not affect other database operations by changing the isolation level 
globally.

- Force a lock on the many-to-many updates (using select for update, 
maybe)?

- Use the object history to perform date matching and detect concurrent 
modifications of the same object before we make them, raising an error?


On Monday, March 14, 2016 at 20:12:29 (UTC-3), Shai Berger wrote:
>
> Hi, 
>
> I just commented on the ticket, but wanted to clarify a few things here: 
>
> On Tuesday 15 March 2016 00:48:02 Cristiano Coelho wrote: 
> > The django-admin interface is quite bad at handling concurrent 
> > modifications, this is one problem that might not happen on other 
> backends 
> > and is quite critical, but other issues (that ain't critical like data 
> loss 
> > but might cause unexpected errors) like two people viewing a model with 
> > some inline relations, one of them deletes one of the inline rows, the 
> > other one performs a save, it will try to re-save the one that's 
> actually 
> > deleted and you end up with a "please correct the errors below" but you 
> > actually see no errors (because the row you tried to save no longer 
> exists 
> > and is no longer loaded). Similar errors happen with other kind of 
> > concurrent modifications. 
>
> Right. 
>   
> > As of the MySQL issue, using a higher isolation level might fix the 
> issue 
>
> For the record, the suggestion made was to use a *lower* isolation level, 
> to 
> prevent the failure of transactions; I disagree with that recommendation, 
> as 
> it "prevents" the failure by enabling the kinds of data corruption you 
> described above. 
>
> > but I would strongly recommend not changing it as this is mainly an 
> issue 
> > of how django performs the many-to-many save and not MySQL's fault at 
> all. 
>
> ... and there I completely disagree. Django handles the updates in a 
> transaction; first it deletes records, then (if they don't exist) writes 
> them. 
> The MySql docs explain that records are "snapshotted" at the beginning of 
> the 
> transaction, and so may actually be stale later on in it -- but that does 
> not 
> hold when the changes to the records are done *in* the transaction. 
> I asked the ticket's OP to describe a scenario where this is anything but 
> a 
> severe MySql bug, and I ask the same of you. 
>
> Shai. 
>



FileField and ImageField

2016-03-19 Thread Cristiano Coelho
I have recently been writing an AWS S3 storage (I know there are a few 
libraries out there, but I needed some customization). The storage works 
fine so far!

However, there are some implementation details in FileField and ImageField, 
in particular the function 
generate_filename
https://github.com/django/django/blob/master/django/db/models/fields/files.py#L310

that is called before saving a model (and its file/image field). That code 
assumes the file uses standard file paths based on your current operating 
system. This creates quite a few complications on S3, for example, where all 
the paths that simulate folders use a "/": if you are on Windows and your 
file name is something like "hello/my/file.txt", it will rewrite it to 
"hello\\my\\file.txt", which is something I don't want, as I have explicitly 
created an upload_to callable to generate a very specific file name (and it 
shouldn't be changed).
Looking deeper, it seems like the whole FileField implementation relies on a 
directory-and-file structure, which might not always be right.

How easy would it be to add customization for this? Is your only option 
defining your own FileField and ImageField classes that override this 
method?



Re: FileField and ImageField

2016-03-19 Thread Cristiano Coelho
To add a bit more about this: it seems that FileField is really meant to 
work with an OS file system, making it harder to use a custom Storage that 
sends data somewhere like AWS S3, where basically everything is a file 
(there are no real folders, just key prefixes).

These 3 functions inside FileField are the culprits:

def get_directory_name(self):
    return os.path.normpath(force_text(datetime.datetime.now().strftime(force_str(self.upload_to))))

def get_filename(self, filename):
    return os.path.normpath(self.storage.get_valid_name(os.path.basename(filename)))

def generate_filename(self, instance, filename):
    # If upload_to is a callable, make sure that the path it returns is
    # passed through get_valid_name() of the underlying storage.
    if callable(self.upload_to):
        directory_name, filename = os.path.split(self.upload_to(instance, filename))
        filename = self.storage.get_valid_name(filename)
        return os.path.normpath(os.path.join(directory_name, filename))

    return os.path.join(self.get_directory_name(), self.get_filename(filename))



They basically destroy any file name you give them, even with upload_to. 
This is not an issue for a storage that uses the underlying file system, but 
it can be quite an issue on different systems, in particular when file 
names use slashes as key prefixes.
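The Windows behaviour described above can be reproduced on any platform with the stdlib ntpath module (which is what os.path resolves to on Windows), so the breakage does not require a Windows box to verify:

```python
import ntpath
import posixpath

# On Windows, os.path is ntpath, and normpath rewrites the "/" separators
# of an S3-style key with backslashes; on POSIX the key passes unchanged.
key = "hello/my/file.txt"
windows_result = ntpath.normpath(key)   # what a Windows server produces
posix_result = posixpath.normpath(key)  # what Linux/macOS produces
print(windows_result)  # hello\my\file.txt
print(posix_result)    # hello/my/file.txt
```

The same field definition therefore produces different S3 keys depending on the operating system of the server that handled the upload.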

So what I did was to override it a bit:

class S3FileField(FileField):

    def generate_filename(self, instance, filename):
        # If upload_to is a callable, make sure that the path it returns is
        # passed through get_valid_name() of the underlying storage.
        if callable(self.upload_to):
            filename = self.upload_to(instance, filename)
            filename = self.storage.get_valid_name(filename)
            return filename

        return self.storage.get_valid_name(filename)


And all the S3 issues are gone! I wonder if this is the best way to do it. 
It would be great to have an additional keyword argument or something on the 
File (and Image) fields to let the above functions know that they should 
not perform any OS path operations, but it seems like that would cause a lot 
of trouble.



Re: FileField and ImageField

2016-03-20 Thread Cristiano Coelho
I agree with you: generate_filename should just call the field's upload_to 
and then delegate the whole name generation to the storage.

There's another thing about file storage that is troubling me: 
https://github.com/django/django/blob/master/django/core/files/storage.py#L57

The docs state you are not supposed to override the save method and should 
just implement _save; however, doing a completely random rename at the end 
pretty much forces you to always override save instead. That name 
replacement is probably there to store the file name in a way that lets you 
easily create the URL, but again, that should be delegated to the 
storage.url method, shouldn't it?

There's one more thing I noticed (I'm going to open a ticket about this 
one): the proxy class for FileField (FieldFile, what a name!) states in the 
docs that in order to call its save method you need a Django File object 
rather than a simple file-like object. I can understand that's because the 
save method uses the .size property 
(https://github.com/django/django/blob/master/django/db/models/fields/files.py#L92)
to save the file size into a _size variable. But nowhere in the code does 
_size seem to be used, since size is a property that gets the file size 
either from the storage class or the actual file. So removing that line 
could also allow us to use normal files in the save method.

On Wednesday, March 16, 2016 at 23:41:58 (UTC-3), Josh Smeaton wrote:
>
> It seems like FileField should delegate some of these methods to an 
> underlying Storage backend, no? I don't know what the implications to 
> back-compat would be, but the idea seems like a sensible one to start with. 
> The storage backend API may need to grow some additional methods to 
> verify/validate paths and filenames or it might already have the correct 
> methods needed for FileField to work. Fields should do all of their 
> path/storage IO via their storage object though.
>
>
> On Thursday, 17 March 2016 12:16:00 UTC+11, Cristiano Coelho wrote:
>>
>> To add a bit more about this, it seems that FileField is really meant to 
>> be working with an OS file system, making it harder to use a custom Storage 
>> that sends data to somewhere like AWS S3 where basically everything is a 
>> file (there are no real folders, just key prefixes)
>>
>> These 3 functions inside FileField are the culprits:
>>
>> def get_directory_name(self):
>> return 
>> os.path.normpath(force_text(datetime.datetime.now().strftime(force_str(self.upload_to))))
>>
>> def get_filename(self, filename):
>> return 
>> os.path.normpath(self.storage.get_valid_name(os.path.basename(filename)))
>>
>> def generate_filename(self, instance, filename):
>> # If upload_to is a callable, make sure that the path it returns is
>> # passed through get_valid_name() of the underlying storage.
>> if callable(self.upload_to):
>> directory_name, filename = 
>> os.path.split(self.upload_to(instance, filename))
>> filename = self.storage.get_valid_name(filename)
>> return os.path.normpath(os.path.join(directory_name, filename))
>>
>> return os.path.join(self.get_directory_name(), 
>> self.get_filename(filename))
>>
>>
>>
>> They basically destroy any file name you give to it even with upload_to. 
>> This is not an issue on a storage that uses the underlying file system, but 
>> it might be quite an issue on different systems, in particular if file 
>> names are using slashes as prefixes.
>>
>> So what I did was to override it a bit:
>>
>> class S3FileField(FileField):
>>  
>> def generate_filename(self, instance, filename):
>> # If upload_to is a callable, make sure that the path it returns is
>> # passed through get_valid_name() of the underlying storage.
>> if callable(self.upload_to):
>> filename = self.upload_to(instance, filename)
>> filename = self.storage.get_valid_name(filename)
>> return filename
>>
>> return self.storage.get_valid_name(filename)
>>
>>
>> And all S3 issues gone! I wonder if this is the best way to do it. It 
>> would be great to have an additional keyword argument or something on the 
>> File (and image) fields to let the above functions know that they should 
>> not perform any OS operation on paths but seems like it would cause a lot 
>> of trouble.
>>
>


Re: [GSoC 2016]Proposal: Validity check at client and dynamic form framework

2016-03-20 Thread Cristiano Coelho
Client-side validation is a very good idea; other frameworks such as 
ASP.NET MVC already provide some basic client-side validation tied to model 
fields (the equivalent of Django forms) and also provide a very easy way to 
add custom JavaScript validation and tie it to the model/form.

As for the second part about dynamic forms, it really looks complicated to 
implement; it looks like a job for more than one person.



Re: MySQL data loss possibility with concurrent ManyToManyField saves

2016-03-20 Thread Cristiano Coelho
What performance changes can you expect from this switch? That is probably 
the default on MySQL for a good reason.



Re: [GSoC 2016]Proposal: Validity check at client and dynamic form framework

2016-03-21 Thread Cristiano Coelho
I don't have enough knowledge of how GSoC works and what the Django team 
expects from project proposals, but if it were me, I would rather have one 
very well implemented, well tested and highly customizable client-side 
validation feature than two not-so-polished features; that depends on the 
time available, though. I'm sure someone closer to this will be able to give 
you a better answer, but the two ideas are definitely interesting :)


On Sunday, March 20, 2016 at 23:52:32 (UTC-3), 7sDream wrote:
>
> Thanks for your advice! 
>
> My first thought is the client side validation too. Because I used some 
> Forms in my Django project, after write javascript for each form, I feel 
> this is redundant.
>
> Then I thought of another common demand of Form, the dynamic form. After I 
> draw the chart, I also find it is a little complicated so I arranged for 
> eight weeks to finish it. 
>
> Do you think I should focus on client side validation (delete dynamic form 
> part)? Or describe the complexity of dynamic form and give a more detailed 
> schedule?
>
> Thanks again!
>
> On Monday, March 21, 2016 at 2:23:16 AM UTC+8, Cristiano Coelho wrote:
>>
>> The client side validation is a very good idea, other frameworks such as 
>> ASP.NET MVC already has some basic client side validation tied to model 
>> fields (equivalent to django forms) and also provides a very easy way to 
>> add custom javascript validation and tie to the model/form.
>>
>> For the second part about dynamic forms, it really looks complicated to 
>> implement, looks like a job for more than one person.
>>
>>



Re: MySQL data loss possibility with concurrent ManyToManyField saves

2016-03-21 Thread Cristiano Coelho
Shai Berger, this explanation is pure gold! Definitely better than MySQL's 
own.

Now, I may agree that changing the isolation level should probably be left 
to a major release, or at least carry a huge warning in the release notes. I 
personally can't think of any project of mine where this would be an issue, 
since database errors are usually well handled (and I think this change 
would increase them, since you will be reading more data modified by other 
transactions, if I read correctly), and critical operations should be 
handled with either a select for update or atomic SQL operations (such as 
doing col = col + 1).
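The "col = col + 1" style of update mentioned above (which Django spells with an F() expression, e.g. update(hits=F('hits') + 1)) can be sketched with the stdlib sqlite3 module; the counter table here is made up for the example:

```python
import sqlite3

# Doing the arithmetic inside the UPDATE keeps the read and the write in
# one statement, so concurrent increments cannot overwrite each other the
# way a read-modify-write round trip through Python can.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, hits INTEGER)")
conn.execute("INSERT INTO counter VALUES (1, 0)")

for _ in range(3):
    conn.execute("UPDATE counter SET hits = hits + 1 WHERE id = 1")

(hits,) = conn.execute("SELECT hits FROM counter WHERE id = 1").fetchone()
print(hits)  # 3
```

Each UPDATE sees the value the previous one committed, which is exactly the property lost when the increment is computed client-side from a stale read.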

On Monday, March 21, 2016 at 4:27:59 (UTC-3), Shai Berger wrote:
>
> First of all, I would like to say that I strongly support the move to READ 
> COMITTED, including backporting it to 1.8.x. 
>
> But we also need to explain: REPEATABLE READ is a higher transaction 
> isolation 
> level than READ COMMITTED. If you have problematic code, it should lead to 
> more deadlocks and/or transactions failing at commit time (compared to 
> READ 
> COMMITTED), not to data loss. The reason we get data losses is MySql's 
> unique 
> interpretation of REPEATABLE READ. If you're interested in the details 
> (and if 
> you use MySql, you should be), read on. 
>
> With MySql's REPEATABLE READ, the "read" operations -- SELECT statements 
> -- 
> indeed act like they act in the usual REPEATABLE READ: Once you've read 
> some 
> table, changes made to that table by other transactions will not be 
> visible 
> within your transaction. But "write" operations -- UPDATE, DELETE, INSERT 
> and 
> the like -- act as if they're under READ COMMITTED, affecting (and 
> affected by) 
> changes committed by other transactions. The result is, essentially, that 
> within a transaction, the reads are not guaranteed to be consistent with 
> the 
> writes [1]. 
>
> In particular, in the bug[2] that caused this discussion, we get the 
> following   
> behavior in one transaction: 
>
> (1) BEGIN TRANSACTION 
>
> (2) SELECT ... FROM some_table WHERE some_field=some_value 
> (1 row returned) 
>
> (3) (some other transactions commit) 
>
> (4) SELECT ... FROM some_table WHERE some_field=some_value 
> (1 row returned, same as above) 
>
> (5) DELETE some_table WHERE some_field=some_value 
> (answer: 1 row deleted) 
>
> (6) SELECT ... FROM some_table WHERE some_field=some_value 
> (1 row returned, same as above) 
>
> (7) COMMIT 
> (the row that was returned earlier is no longer in the 
> database) 
>
> Take a minute to read this. Up to step (5), everything is as you would 
> expect; 
> you should find steps (6) and (7) quite surprising. 
>
> This happens because the other transactions in (3) deleted the row that is 
> returned in (2), (4) & (6), and inserted another one where 
> some_field=some_value; that other row is the row that was deleted in (5). 
> The 
> row that this transaction selects was not seen by the DELETE, and hence 
> not 
> changed by it, and hence continues to be visible by the SELECTs in our 
> transaction. But when we commit, the row (which has been deleted) no 
> longer 
> exists. 
>
> I have expressed elsewhere my opinion of this behavior as a general 
> database 
> feature, and feel no need to repeat it here; but I think that, if 
> possible, it 
> is Django's job as a framework to protect its users from it, at least as a 
> default. 
>
> On Monday 21 March 2016 02:25:37 Cristiano Coelho wrote: 
> > What performance changes can you expect doing this change? It is 
> probably 
> > that default on MySQL for a good reason. 
>
> The Django project is usually willing to give up quite a lot of 
> performance in 
> order to prevent data losses. I agree that this default on MySql is 
> probably 
> for a reason, but I don't think it can be a good reason for Django. 
>
> Have fun, 
> Shai. 
>
> [1] https://dev.mysql.com/doc/refman/5.7/en/innodb-consistent-read.html 
> [2] https://code.djangoproject.com/ticket/26347 
>



Re: Enforcing a max size for form field values read into memory (review/determination of next steps needed)

2016-04-15 Thread Cristiano Coelho
I have a small concern.

The two new settings look like they will act on the uploaded file count 
(multipart encodings) and on the number of fields sent (URL-encoded 
bodies). What happens with other request types such as JSON, XML, plain 
text, etc.? If you are using django-rest-framework, how would the field 
counter work? It would be a shame if only multipart and URL-encoded uploads 
benefited from these checks while JSON, XML and others could still be 
"exploited".
Note I didn't really read the code changes completely, so I'm speaking with 
almost no knowledge of the proposed change.



Re: Enforcing a max size for form field values read into memory (review/determination of next steps needed)

2016-04-20 Thread Cristiano Coelho
Hi,

In particular I'm interested in the new setting DATA_UPLOAD_MAX_MEMORY_SIZE 
[1], which only seems to be checked against multipart [2] and URL-encoded 
[3] request bodies.

It would be good if this setting were also checked against other types 
where request.body is read directly, since you can still get the content 
length from the body, right? Please correct me if I'm wrong, but once 
inside Django code, all body data is loaded into memory except for files 
and multipart uploads, which are streamed.
So JSON, XML or even plain-text POST requests could benefit from the 
DATA_UPLOAD_MAX_MEMORY_SIZE setting. It could be very convenient: for 
example, if an attacker sends a huge JSON document, the Python (at least 
2.7) json.loads call usually crashes with an out-of-memory error when the 
string is too big, while still creating a huge RAM spike.


[1] 
https://github.com/django/django/pull/6447/files#diff-ba8335f5987fcd81d41c28cd1879a9bfR291
[2] 
https://github.com/django/django/pull/6447/files#diff-ba8335f5987fcd81d41c28cd1879a9bfR291
[3] 
https://github.com/django/django/pull/6447/files#diff-0eb6c5000a61126731553169fddb306eR294
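A sketch of the kind of pre-parse size check suggested above. The 2.5 MB limit mirrors the proposed DATA_UPLOAD_MAX_MEMORY_SIZE default; the exception class and parse_json_body helper are defined locally for illustration and are not Django API:

```python
import json

MAX_BODY_SIZE = 2_621_440  # 2.5 MB, the proposed default

class RequestDataTooBig(Exception):
    # Illustrative local class; only the name mirrors the Django patch.
    pass

def parse_json_body(body: bytes):
    # Reject by length before json.loads allocates anything big.
    if len(body) > MAX_BODY_SIZE:
        raise RequestDataTooBig("request body exceeds the configured limit")
    return json.loads(body)

ok = parse_json_body(b'{"ok": true}')

rejected = False
try:
    parse_json_body(b"x" * (MAX_BODY_SIZE + 1))
except RequestDataTooBig:
    rejected = True
print(ok, rejected)  # {'ok': True} True
```

The length check is O(1) on an already-read body, so the guard costs nothing while preventing the RAM spike described above.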


El martes, 19 de abril de 2016, 13:06:27 (UTC-3), Tom Christie escribió:
>
> > If you are using django-rest-framework, how would the fields counter 
> work?. It would be a shame if only multi part and urlencoded uploads would 
> have the benefit of these checks, while still allowing json, xml and others 
> still be "exploited".
> Note I didn't really read the code changes completely so I'm talking with 
> almost no knowledge on the proposed change.
>
> They wouldn't be respected by anything other than multi-part or urlencoded 
> requests.
> Tim's correct in noting that accessing `request.body` or `request.stream` 
> won't apply these checks (which is for example, what REST framework does).
>
> Even so I think this is probably a reasonable approach. We could add 
> support for respecting these settings in REST framework too, once they 
> exist.(Although I think we'd have need to have a stricter consideration of 
> backwards compat wrt. folks POSTing large amounts of JSON data)
>
>



Re: NumericListfilter or similar

2016-04-30 Thread Cristiano Coelho
Implementing a custom filter with an arbitrary text input is quite easy: 
all you need is a template and a subclass of ListFilter. However, I agree it 
would be great if Django already offered this as an option, since ListFilter 
and FieldListFilter are usually not enough.



Re: Table Locks and bulk creating inherited models

2016-05-03 Thread Cristiano Coelho
In my opinion, SELECT ... FOR UPDATE is already quite powerful for adding 
locks. It might not be as good as a straight table lock, but it gives you 
enough power to lock rows (select on indexed columns and most databases will 
lock only the matching rows) or select on a non-indexed column, which will 
probably end up locking the whole table.
As said above, databases have many vendor-specific lock types that could get 
quite complicated to abstract without limiting functionality.



Re: Tracking/logging bruteforcing, especially on admin accounts?

2016-05-19 Thread Cristiano Coelho
IP-based throttling like django-rest-framework's would be ideal! I know 
there are some third-party libraries that try to add IP-based throttling to 
Django, although they are not as polished as the DRF ones.
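For illustration, a sliding-window throttle of the kind DRF implements can 
be sketched in a few lines of plain Python. The names and limits here are 
hypothetical, and a real deployment would keep this state in a shared cache 
rather than process memory:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60            # hypothetical window length
MAX_ATTEMPTS = 5               # hypothetical per-window limit
_attempts = defaultdict(list)  # ip -> timestamps of recent attempts

def allow_attempt(ip, now=None):
    """Return True if this ip may attempt a login, False if throttled."""
    now = time.time() if now is None else now
    # Keep only attempts that still fall inside the sliding window.
    recent = [t for t in _attempts[ip] if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_ATTEMPTS:
        _attempts[ip] = recent
        return False
    recent.append(now)
    _attempts[ip] = recent
    return True
```

A view would call allow_attempt(request_ip) before validating credentials 
and return an error response (or force a pause) when it comes back False.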

On Thursday, May 19, 2016 at 8:11:27 (UTC-3), Vaibhav Mallya wrote:
>
> Hi everyone,
>
> I've been having a great chat with @jacobian here about potential security 
> improvements to the Django admin UI: 
> https://gist.github.com/mallyvai/bcb0bb827d6d53212879dff23cf15d03
>
> The admin UI is core to Django's value-prop, and it seems undersecured by 
> modern standards. I wanted to raise the possibility on this list of 
> guarding against brute-force login attempts on admin user accounts.
>
> You could imagine tracking / throttling in two main ways.
>
>
> 1. By originating IP
> 2. By number of login attempts on a user account
>
>
> In my view, #2 would be the best starting point - there are going to be a 
> relatively small number of admin accounts on the average Django site. Ergo, 
> focused brute-forcing or spearphishing seems to be a greater threat than 
> getting into an admin account via a lot of scanning.
>
> I am proposing a solution broken into two parts - tracking and enforcing.
>
> Tracking - End goal would be logging SuspiciousOperation if appropriate 
> thresholds were crossed. We’d need to store server-side state. I don’t 
> believe we have a clean heap data structure across all DBs that the 
> Django ORM supports, but we could, say, keep two additional columns on each 
> user object: last_login_attempt_window_start and 
> num_login_attempts_on_window_start, checking / updating both on any / 
> all login attempts. Alternatively, simply serializing a Python heap-list 
> for each user may work.
>
> Or we can simply leverage the cache backend to store state.
>
> Enforcing - Rejecting login attempts [on any basis] is probably not a good 
> idea for a default - we can’t guarantee we don’t introduce some other 
> DoS-style attack vector. But there are some NIST/etc guidelines around, 
> say, forcing pauses between login attempts, exponential backoff, forcing 
> email-distributed tokens to be used, etc.
>
> We’re already storing custom auth/session information for the Django user 
> model, so storing state/migrations/etc somewhere wouldn’t be too much of a 
> departure.
>
> Thanks!
> -Vaibhav
>



Threads and db connection handling question

2016-06-01 Thread Cristiano Coelho
Let me start by saying sorry if this actually belongs on django-users rather 
than django-developers.

I'm curious about how exactly Django handles database connections, in 
particular when persistent connections are used (and if someone can point me 
to the code). As far as I can tell from the docs, connections are kept per 
thread, and opened and closed (or returned to whatever pool is used, if 
persistent connections are enabled) per request.

Now, what happens when you use either a new thread or something like 
Python's thread pool (either through the new Python 3 API or the old Python 
2 multiprocessing.pool.ThreadPool class)? It seems connections are correctly 
opened, and committed if any data-modification query is executed, but it 
also seems they are never returned/closed. That is not bad in the case of a 
thread pool, since you know that thread will want to keep its connection for 
as long as it lives. But what happens exactly if the thread / thread pool 
dies? On Postgres at least (with Django 1.9.5), it seems the connection is 
returned/closed when the whole app server is restarted, but might be left 
open if the thread dies unexpectedly.

With Postgres I have been experiencing some issues with connections leaking; 
my app uses some thread pools that are basically started with Django. I 
can't really find the source of the leak, as the connections are correctly 
closed if I restart the machine (I'm using Amazon cloud services), and it 
seems they are also correctly closed on app updates, which basically means 
restarting Apache. But in some very specific cases, those thread pools end 
up leaking the connection.

Does Django have any code that listens for thread exit and gracefully closes 
the connection held by it? Also, is there any chance that a connection may 
leak if the server is restarted before a request is finished? It seems 
Django returns the connection only after a request is over in those cases.

Also, if the connection gets corrupted/closed by the server, does Django 
retry opening it, or is that thread's connection dead forever and the thread 
basically unusable?

There's really not a lot of documentation on what happens when you use 
Django's ORM from threads that are not part of the current request; 
hopefully I can get pointed to some code or docs about this.

There's a good answer here 
http://stackoverflow.com/questions/1303654/threaded-django-task-doesnt-automatically-handle-transactions-or-db-connections
about some issues with threads and Django connections, but it seems old.



Re: Threads and db connection handling question

2016-06-01 Thread Cristiano Coelho
That's pretty close, but on a much more difficult level, since it is about 
multiprocessing. Things get very odd with multiprocessing and Django; 
compared to that, with threads you can easily launch a new thread or pool 
without any special work. It is just the database connection handling that 
has some kind of possible connection leak that I'm trying to figure out. 
Also, the Stack Overflow post I linked has a good explanation of Django's DB 
connection handling, but it is quite outdated, things have probably changed 
a bit, and it isn't clear whether there's any connection handling on thread 
exit or whether you need to close every connection manually.

On Wednesday, June 1, 2016 at 22:41:42 (UTC-3), Tim Graham wrote:
>
> Here's a ticket requesting the documentation you seek (as far as I 
> understand):
> https://code.djangoproject.com/ticket/20562
>



Re: Threads and db connection handling question

2016-06-01 Thread Cristiano Coelho
Yes, mainly for things that need to be done asynchronously where you don't 
want to hold up the request, such as email sending or logging. This work is 
just offloaded to a thread pool and the request returns instantly. Also, 
when you are using auto-scaling services from Amazon, since each thread pool 
is started per server process, your pools can also scale automatically; you 
don't really want to get into all the trouble of configuring Celery, message 
queues and so on just for this.
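A minimal sketch of this offloading pattern using the standard library's 
concurrent.futures (the function names here are hypothetical; the real work 
would be the SMTP or logging calls):

```python
from concurrent.futures import ThreadPoolExecutor

# One hypothetical pool per server process; since each process owns its own
# pool, auto-scaling the processes also scales the pools.
pool = ThreadPoolExecutor(max_workers=4)

def _send_email(recipient, body):
    # Placeholder for the real SMTP / logging work.
    return "sent to %s" % recipient

def send_email_async(recipient, body):
    # Fire-and-forget from the view; the request returns immediately
    # while the pool thread does the slow work in the background.
    return pool.submit(_send_email, recipient, body)
```

The returned Future can be ignored for fire-and-forget use, or kept when the 
caller later wants the result or any raised exception.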

On Wednesday, June 1, 2016 at 23:19:29 (UTC-3), Stephen Butler wrote:
>
> Is there a good reason to do this with your own custom thread pool 
> management inside Django and (I'm assuming) WSGI? Celery is a well 
> understood solution to the problem of background tasks and has a really 
> nice API.
>


Re: Threads and db connection handling question

2016-06-02 Thread Cristiano Coelho
So is what was stated in the Stack Overflow post, that connections are 
somehow closed only at the end of a request through the request-finished 
signal, still the actual behavior?

Any best/suggested practices on how to handle connections in threads that 
are not part of the request cycle? Considering the connections are 
automatically opened for you, it would be great for them to be automatically 
closed/disposed on the thread's death, which right now seems to happen 
sometimes and sometimes not, leaking connections (something I'm still trying 
to figure out).



Re: Threads and db connection handling question

2016-06-02 Thread Cristiano Coelho

On Thursday, June 2, 2016 at 11:48:33 (UTC-3), Florian Apolloner wrote:
>
>
> No it would not be great at all, connections could theoretically be shared 
> between threads etc… In general Django has no way of knowing when you want 
> to close it. In the end, a "dying" thread which is not properly closed is 
> a bug in your code anyway.
>
> Cheers,
> Florian
>  
>

Not always. For example, on Amazon Elastic Beanstalk, when you either 
restart the app server or upload a new version, it basically kills Apache 
and all WSGI processes through a SIGTERM, so those thread pools are probably 
killed in a bad way and you don't really have control over that. Also, you 
don't really have control over the life of a thread-pool thread, so a given 
thread could be gracefully stopped by the pool implementation, but you can't 
really run any cleanup code before that happens for that thread (at least 
not that I'm aware of for multiprocessing.pool.ThreadPool).

As Aymeric pointed out, it seems those connections are correctly closed most 
of the time when a thread dies, but for some reason Postgres would keep some 
connections open. Are there any rare cases where, even if the thread is 
stopped, the connection won't be closed? The only thing I can think of is 
that those threads are never garbage collected or something.



Re: Threads and db connection handling question

2016-06-02 Thread Cristiano Coelho
Florian, 

Sorry about the SIGTERM and SIGKILL confusion; I think I read somewhere some 
time ago that SIGTERM would instantly finish any pending request, so I 
assumed it would also kill any thread in a not-so-nice way. Now that you 
mention it, there's one SIGKILL in the Apache logs (compared to the 
thousands of SIGTERMs due to restarts). However, the connections that were 
somehow stuck and never closed dated from about two weeks ago; yes, there 
were connections that were opened two weeks ago and never closed, even 
though Apache was restarted many times every day!

About thread pools: I'm talking about the Python thread pool I'm using to 
offload work, not any Django pool, and these pools are the ones whose 
threads I have no control over, as they are completely managed by the 
thread-pool library.

Three days have passed since I noticed those hung connections and the issue 
hasn't repeated yet. Maybe some really odd condition caused them, since the 
thread pool's connections are indeed being correctly closed on server 
restarts; a very odd case created those hung connections.

So just to be sure: is SIGTERM actually propagated to Python code so it can 
gracefully kill all threads, garbage collect, and close connections? Would a 
SIGKILL actually prevent any kind of cleanup, leaving a chance for 
Python/Django to leave some connections open?

Maybe this is instead a Postgres issue that happened for some very odd 
reason.


Finally, would it be possible, through some kind of callback on the 
thread-local object, to fire a connection close before a thread dies? This 
would certainly help, rather than waiting for the connection to be garbage 
collected. You mentioned that connections could end up being shared between 
threads, but I don't see that being done in Django at all.


On Thursday, June 2, 2016 at 19:32:15 (UTC-3), Florian Apolloner wrote:
>
> On Thursday, June 2, 2016 at 11:55:41 PM UTC+2, Cristiano Coelho wrote:
>>
>> Not always, for example, on amazon elastic beasntalk when you either 
>> restart the app server or upload a new version, it basically kills apache 
>> and all WSGI processes through a sigterm
>>
>
> A SIGTERM is a normal signal and should cause a proper shutdown.
>  
>
>> so those thread pools are probably killed in a bad way and you don't 
>> really have control over that.
>>
>
> Absolutely not, you are mixing up SIGTERM and SIGKILL.
>  
>
>> Also you don't really have control on the life of a thread pool thread, 
>> so a given thread could be gracefully stopped by the pool implementation
>>
>
> Once again: there is no pool. 
>
> As ayneric pointed out, it seems like those connections are correctly 
>> closed most of the time when a thread dies, but for some reason, postgres 
>> would keep some connections opened.
>>
>
> If a connection is closed properly, postgres will close it accordingly. 
> The only way possible for a connection to stay open while the app is gone 
> is that you are running into tcp timeouts while getting killed with SIGKILL 
> (dunno if the postgres protocol has keep-alive support on the protocol 
> level, most likely not). As long as you are not sending a SIGKILL, python 
> should clean up properly, which should run garbage collection, which then 
> again should delete all connections and therefore close the connection. Any 
> other behavior seems like a bug in Python (or maybe Django, but as long as 
> Python shuts down properly I think we are fine).
>
> Are there any rare cases where even if the thread is stopped the 
>> connection won't be closed? The only thing I can think of are that those 
>> threads are never garbage collected or something.
>>
>
> Depends on the python version you are using, especially thread local 
> behavior changed a lot…
>
> Cheers,
> Florian 
>



Re: Threads and db connection handling question

2016-06-02 Thread Cristiano Coelho
I'm not starting threads by hand but rather using a pool, which handles the 
threads for me; I basically just send a function to the pool and let it run. 
You are right, I could wrap every function sent to the pool with the code 
you proposed, but I also don't want to open and close a connection on every 
function call, only when the thread from the pool is no longer required and 
disposed, which is pretty much on application exit, although the pool 
handler can do whatever it wants with the threads.

On Thursday, June 2, 2016 at 20:22:21 (UTC-3), Stephen Butler wrote:
>
> I'm still a bit confused. What is the advantage of having connections 
> closed automatically when the thread exits? It seems to me that you can 
> quickly solve your problem by modifying your thread start routines:
>
> from django.db import connection
> from contextlib import closing
>
> def my_thread_start():
>     with closing(connection):
>         ...  # do normal work
>
> You can even create a quick decorator if that's too much modification.
>

Re: Threads and db connection handling question

2016-06-02 Thread Cristiano Coelho
Some of the pools might have quite a high load (such as the one that handles 
logging to the database and other sources), so opening and closing a 
connection for each call might end up being bad.

Now that you mention Django's code, does 
connection.close_if_unusable_or_obsolete() always close the connection, or 
does it also handle the case where persistent connections are used (so the 
connection is not closed if it is alive and in a good state)? If so, would 
it be possible to simply replicate what Django does on every request start 
and end, as a wrapper around every function that's sent to the pool? That 
way connections are reused if possible and recycled/closed if they go bad.

Looking at the code, this seems to be called on every request start and end:

def close_old_connections(**kwargs):
    for conn in connections.all():
        conn.close_if_unusable_or_obsolete()

signals.request_started.connect(close_old_connections)
signals.request_finished.connect(close_old_connections)

I guess something similar could be done, but with just that thread's 
connection; then all functions sent to the pool would need to go through a 
wrapper that does this before and after every call.
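The wrapper idea above can be sketched without depending on Django at all by 
injecting the cleanup hook; in a real project the argument would be 
django.db.close_old_connections, while the decorator name here is 
hypothetical:

```python
import functools

def with_connection_hygiene(close_old_connections):
    """Wrap a pool task so stale connections are recycled before and after
    it runs, mirroring what Django does via the request_started and
    request_finished signals."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            close_old_connections()      # recycle before the task runs
            try:
                return func(*args, **kwargs)
            finally:
                close_old_connections()  # and again once it finishes
        return wrapper
    return decorator
```

Every function submitted to the pool would be wrapped this way, so healthy 
connections are reused across tasks while unusable or expired ones are 
dropped at the task boundary.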

On Thursday, June 2, 2016 at 20:50:13 (UTC-3), Stephen Butler wrote:
>
> Do you expect your background threads to be equivalent to or greater than 
> the number of requests you're normally servicing? Usually background tasks 
> are much less frequent than the web requests, so a little overhead w/r/t 
> database connections isn't even going to be noticed.
>
> Looking at what Django does, at the start and end of each request it calls 
> connection.close_if_unusable_or_obsolete(). That function does careful 
> checks to see if the connection is even worth using. Unless you do 
> something similar in your thread_start (adding more complication than I've 
> suggested), having a TLS connection will cause more problems than it saves 
> you. To make this work in general you'd at least need a hook at the point 
> the thread is removed from and added back to the pool, not when the thread 
> exits.
>
> Also, the connection won't be opened unless you actually do something that 
> needs it.
>
> Personally, I think this sounds like something you're trying to optimize 
> before you've profiled that the benefit is worth it.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/3d668979-fae3-4a77-b9b4-fe9e09f763fa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Threads and db connection handling question

2016-06-03 Thread Cristiano Coelho
Aymeric, I never said anything about a connection pool. I'm talking about 
thread pooling to perform async work (without the need to spawn a new 
thread every time, and with control over how much work is offloaded), and 
about the behaviour of django connections when used on a separate thread 
that's not part of the request/response cycle. If you don't take care, 
everything will appear to work, because django is nice enough to open db 
connections for you even from a user-spawned thread, but it won't close 
them for you, so when working with user-spawned threads special care needs 
to be taken with the db connections.

I have changed the thread pool code so it always wraps each function 
between calls to connection.close_if_unusable_or_obsolete(), so the pool 
threads always have a healthy connection and invalid ones are closed. I 
don't know if this is related to the connection leak I saw, which I can't 
reproduce now, but I will keep an eye on it.

It would be good to have a prominent warning in the django docs about 
running queries on threads that are spawned by the user and are not part 
of a request/response cycle, letting users know they have to close the 
connections explicitly.

El viernes, 3 de junio de 2016, 10:01:38 (UTC-3), Aymeric Augustin escribió:
>
> Hello,
>
> I have to say that, as the author of the “persistent connections” feature, 
> I am confused by repeated references to a “connection pool” in this 
> discussion. I chose *not* to implement a connection pool because these 
> aren’t easy to get right. Users who need a connection pool are better off 
> with a third-party option such as pgpool.
>
> When persistent connections are enabled, each thread uses a persistent 
> connection to the database — as opposed to one connection per HTTP request. 
> That said connections aren’t shared or pooled between threads. This 
> guarantees that each connection dies when the thread that opened it dies.
>
> In practice, Django opens a database connection per thread and keeps it 
> open after each request. When the next request comes in, if the connection 
> still works (this is tested with something like “SELECT 1”) and its maximum 
> age isn’t reached, Django re-uses it; otherwise it closes it and opens a 
> new one. This is what the “close_if_unusable_or_obsolete” function does — 
> as the name says :-)
>
> I hope this helps,
>
> -- 
> Aymeric.
>
>



Re: Adding a database-agnostic JSONField into Django

2016-06-24 Thread Cristiano Coelho
I would like it. I honestly only use JSONField to store json data, not 
really to query it, and having postgres store it in a very efficient way 
is a very nice plus compared to plain text storage. The postgres features 
for querying json data are godlike, but I wouldn't mind trading something 
to let the current json field work with every database, mostly so I can 
still develop on SQLite while running postgres in production.

El jueves, 23 de junio de 2016, 6:57:07 (UTC-3), Raphael Hertzog escribió:
>
> Hello, 
>
> in almost all projects I work on, I end up using a JSONField. Since 
> I value being able to run with any database, I'm not relying on 
> django.contrib.postgres.fields.JSONField. So I have been using 
> pypi's django-jsonfield maintained by Matthew Schinckel: 
> https://bitbucket.org/schinckel/django-jsonfield 
> (I have also packaged this for Debian) 
>
> I have recently discovered pypi's "jsonfield" maintained by Brad Jasper: 
> https://github.com/bradjasper/django-jsonfield 
>
> Both projects are very similar (and use the same python package name) and 
> both projects are actually looking for a new maintainer... since I rely on 
> something like this, I would be willing to try to merge the best of both 
> modules into a possible django.contrib.jsonfield or directly into the 
> core. 
>
> We could use this opportunity to let the newly-integrated field use 
> DjangoJSONEncoder by default (see recent discussion about this) and 
> django.contrib.postgres could register its additional lookups into the 
> generic field (assuming we use "jsonb" as underlying type for postgresql). 
>
> What do you think of this? 
>
> If inclusion into Django is not desired, then maybe we could aim to 
> at least merge both of those projects in a single "blessed" third-party 
> module that could be maintained in 
> https://github.com/django/django-jsonfield? 
>
> Cheers, 
> -- 
> Raphaël Hertzog ◈ Writer/Consultant ◈ Debian Developer 
>
> Discover the Debian Administrator's Handbook: 
> → http://debian-handbook.info/get/ 
>



Re: Possible Bug in RegexURLResolver

2016-07-14 Thread Cristiano Coelho
If you are using thread-locals, doesn't that mean each thread will still 
end up running the actual populate code? Is there any point in the 
RegexURLResolver class being a singleton, then?
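For reference, the check-outside-and-inside-the-lock pattern that Aymeric and Florian describe further down the thread can be sketched roughly like this (class and attribute names are illustrative, not Django's actual implementation):

```python
import threading

class Resolver:
    """Minimal sketch of the double-checked locking pattern
    discussed in this thread."""

    def __init__(self):
        self._populated = False
        # RLock, in case _populate calls itself recursively in one thread
        self._lock = threading.RLock()
        self.populate_runs = 0

    def _populate(self):
        if self._populated:            # fast path: no lock once populated
            return
        with self._lock:
            if self._populated:        # another thread won the race
                return
            self.populate_runs += 1    # ... the expensive populate work ...
            self._populated = True
```

The outer check avoids the lock on the common path; the inner check ensures the work runs once even when several threads race to it.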


El jueves, 14 de julio de 2016, 9:12:54 (UTC-3), Marten Kenbeek escribió:
>
> It's not quite as simple. Concurrency is actually not the main issue here. 
> The issue is that self._populate() only populates the app_dict, 
> namespace_dict 
> and reverse_dict for the currently active language. By short-circuiting 
> if the resolver is populating, you have the chance to short-circuit while 
> the populating thread has a different active language. The short-circuiting 
> thread's language won't have been populated, and that will result in the 
> above KeyError.
>
> The issue that self._populating is solving is that a RegexURLResolver can 
> recursively include itself if a namespace is used. Namespaces are loaded 
> lazily on first access, so this would generally not result in infinite 
> recursion. But, to include all namespaced views in self._callback_strs, 
> you need to load them immediately. self._populating prevents infinite 
> recursion in that specific case.
>
> On a side note: recursively including the same resolver is possible under 
> some limited circumstances, but so far I've only seen it happen in the 
> Django test suite, and it doesn't even work if you don't include at least 
> one namespace between each recursive include. It's an edge-case scenario 
> that can be solved better by using a repeating pattern in your regex. I 
> don't think that it provides any value, but it does significantly 
> complicate the code. Removing this accidental "feature" would solve the 
> issue as well, without complicating the code any further. 
>
> Now, to solve the issue within the constraints of the current test suite, 
> you only need to prevent that self._populate() is called twice on the 
> same object within the same thread. A simple thread-local object should do 
> the trick:
>
> class RegexURLResolver(LocaleRegexProvider):
>     def __init__(self, regex, urlconf_name, default_kwargs=None,
>                  app_name=None, namespace=None):
>         ...
>         self._local = threading.local()
>         self._local.populating = False
>         ...
>
>     def _populate(self):
>         if self._local.populating:
>             return
>         self._local.populating = True
>         ...
>         self._local.populating = False
>
>
> This will work concurrently, because all lists (lookups, namespaces, apps) 
> are built up in local variables, and then set in 
> self._namespace_dict[language_code] etc. as an atomic operation. The 
> worst that can happen is that the list is overwritten atomically with an 
> identical list. self._callback_strs is a set, so updating it with values 
> that are already present is not a problem either.
>
>
> Marten
>
>
> On Wednesday, July 13, 2016 at 12:22:36 AM UTC+2, Cristiano Coelho wrote:
>>
>> Keep in mind your code guys is semantically different from the one of the 
>> first post.
>>
>> In the first post, the _populate method can be called more than once, and 
>> if the time window is long enough, the two or more calls will eventually 
>> run the # ... populate code again, but on your code, the populate code will 
>> only be called once doesn't matter how many times you call _populate, 
>> unless the populated variable is restarted somewhere else. So I don't know 
>> how is this exactly used but make sure to check it
>>
>> El martes, 12 de julio de 2016, 14:58:12 (UTC-3), Aymeric Augustin 
>> escribió:
>>>
>>> On 12 Jul 2016, at 19:46, Florian Apolloner <f.apo...@gmail.com> wrote:
>>>
>>>
>>> On Tuesday, July 12, 2016 at 9:25:37 AM UTC+2, Aymeric Augustin wrote:
>>>>
>>>> Can you check the condition inside the lock? The following pattern 
>>>> seems simpler to me:
>>>>
>>>
>>> The standard pattern in such cases is to check inside and outside. 
>>> Outside to avoid the lock if you already populated (the majority of 
>>> requests) and inside to see if another thread populated it in the time you 
>>> waited to get the lock…
>>>
>>>
>>> Yes, actually that’s what I did the last time I implemented this 
>>> pattern, in Apps.populate.
>>>
>>> -- 
>>> Aymeric.
>>>
>>>
>>>



Re: Possible Bug in RegexURLResolver

2016-07-12 Thread Cristiano Coelho
Indeed that code is much simpler as long as the changed behaviour of the 
populated/populating flag doesn't break anything.

El martes, 12 de julio de 2016, 4:25:37 (UTC-3), Aymeric Augustin escribió:
>
> Hello,
>
> Can you check the condition inside the lock? The following pattern seems 
> simpler to me:
>
> def _populate(self):
> with self._lock:
> if self._populated:
> return
> # … populate …
> self._populated = True
>
> You can look at Apps.populate for an example of this pattern.
>
> Please forgive me if I missed a reason for using something more 
> complicated.
>
> self._lock needs to be a RLock if _populate can call itself recursively in 
> a given thread.
>
> Best regards,
>
> -- 
> Aymeric.
>
> On 12 Jul 2016, at 03:20, Cristiano Coelho <cristia...@gmail.com 
> > wrote:
>
> Sorry about my answer above; I think the locking should actually go like 
> this, so all threads can reach the wait line while _populating is True.
>
> def _populate(self):
>     self.lock.acquire()
>
>     if self._populating:
>         self.lock.wait()
>         self.lock.release()
>         return
>
>     self._populating = True
>
>     self.lock.release()
>
>     ...
>
>     self.lock.acquire()
>     self._populating = False
>     self.lock.notify_all()
>     self.lock.release()
>
>
>
> El lunes, 11 de julio de 2016, 22:14:04 (UTC-3), Cristiano Coelho escribió:
>>
>> Wouldn't a standard Lock do the trick? Also you are still vulnerable to a 
>> race condition when reading self._populating, if the goal is to avoid 
>> populating the object more than once in a short interval (in a case where 
>> multiple requests hit the server before the object is initialized for the 
>> first time?) you are still running all the critical code on all threads if 
>> they check self.populating before it is set to True.
>>
>> Would a condition work better in this case? Something like this.. (add 
>> any missing try/finally/with that might be required), warning, NOT TESTED, 
>> just an idea.
>> The below code will not be prone to race conditions as the variables are 
>> always read/set under a lock, and also the critical section code will be 
>> only executed ONCE even if multiple threads attempt to run it at once, 
>> while still locking all threads to prevent returning before the code is 
>> done.
>>
>> def __init__(self, regex, urlconf_name, default_kwargs=None,
>>              app_name=None, namespace=None):
>>
>>     ...
>>
>>     self._populating = False
>>     self.lock = threading.Condition()
>>
>> def _populate(self):
>>     self.lock.acquire()
>>
>>     if self._populating:
>>         self.lock.wait()
>>         self.lock.release()
>>         return
>>
>>     self._populating = True
>>
>>     ...
>>
>>     self._populating = False
>>     self.lock.notify_all()
>>     self.lock.release()
>>
>>
>>
>>
>>
>> El lunes, 11 de julio de 2016, 10:08:50 (UTC-3), a.br...@rataran.com 
>> escribió:
>>>
>>> Hello everyone.
>>>
>>> As suggested by Markus on django-users group, I'm posting this here too.
>>>
>>> ---
>>>
>>> I'm using django (1, 10, 0, u'beta', 1).
>>>
>>> When I try to reverse url in shell everything goes fine.
>>>
>>> When under nginx/uwsgi with many concurrent requests I get 
>>>
>>> ... /local/lib/python2.7/site-packages/django/urls/resolvers.py", line 
>>> 241, in reverse_dict
>>> return self._reverse_dict[language_code]
>>> KeyError: 'it'
>>>
>>> After a while I figured out that RegexURLResolver is memoized by 
>>> get_resolver and so it acts like a singleton across a certain number of 
>>> requests.
>>>
>>> Analyzing the code of RegexURLResolver I found that the method 
>>> _populate will return directly if it has been called before and has not 
>>> yet finished.
>>>
>>> ...
>>> def _populate(self):
>>>     if self._populating:
>>>         return
>>>     self._populating = True
>>>     ...
>>>
>>> If used for a recursive call in a single thread this will not hurt, but 
>>> in my case, in uwsgi multi-thread mode, I got the error.
>>>
>>> here is my quick and dirty fix:
>>>
>>> class RegexURLResolver(LocaleRegexProvider):

Re: Admin-Actions also in the object details view

2016-07-11 Thread Cristiano Coelho
Thanks, I guess that can do it for now. It's a shame the one who started 
implementing this in django itself just abandoned it :(

El lunes, 11 de julio de 2016, 8:55:17 (UTC-3), Alex Riina escribió:
>
> Here's an implementation:
>
> https://github.com/crccheck/django-object-actions
>
> Combining with other admin plugins that change the template requires 
> overriding the template to fit both changes in.
>
> The actions are GETs and have no CSRF protection.
>
>



Re: Possible Bug in RegexURLResolver

2016-07-11 Thread Cristiano Coelho
Wouldn't a standard Lock do the trick? Also, you are still vulnerable to a 
race condition when reading self._populating: if the goal is to avoid 
populating the object more than once in a short interval (say, when 
multiple requests hit the server before the object is initialized for the 
first time), all threads will still run the critical code if they check 
self._populating before it is set to True.

Would a condition work better in this case? Something like this (add any 
missing try/finally/with that might be required) -- warning, NOT TESTED, 
just an idea.
The code below is not prone to race conditions, since the variables are 
always read/set under a lock, and the critical section will only be 
executed ONCE even if multiple threads attempt to run it at once, while 
still blocking all threads so none returns before the work is done.

def __init__(self, regex, urlconf_name, default_kwargs=None,
             app_name=None, namespace=None):

    ...

    self._populating = False
    self.lock = threading.Condition()

def _populate(self):
    self.lock.acquire()

    if self._populating:
        self.lock.wait()
        self.lock.release()
        return

    self._populating = True

    ...

    self._populating = False
    self.lock.notify_all()
    self.lock.release()





El lunes, 11 de julio de 2016, 10:08:50 (UTC-3), a.br...@rataran.com 
escribió:
>
> Hello everyone.
>
> As suggested by Markus on django-users group, I'm posting this here too.
>
> ---
>
> I'm using django (1, 10, 0, u'beta', 1).
>
> When I try to reverse url in shell everything goes fine.
>
> When under nginx/uwsgi with many concurrent requests I get 
>
> ... /local/lib/python2.7/site-packages/django/urls/resolvers.py", line 
> 241, in reverse_dict
> return self._reverse_dict[language_code]
> KeyError: 'it'
>
> After a while I figured out that RegexURLResolver is memoized by 
> get_resolver and so it acts like a singleton across a certain number of 
> requests.
>
> Analyzing the code of RegexURLResolver I found that the method _populate 
> will return directly if it has been called before and has not yet finished.
>
> ...
> def _populate(self):
>     if self._populating:
>         return
>     self._populating = True
>     ...
>
> If used for a recursive call in a single thread this will not hurt, but 
> in my case, in uwsgi multi-thread mode, I got the error.
>
> here is my quick and dirty fix:
>
> class RegexURLResolver(LocaleRegexProvider):
>     def __init__(self, regex, urlconf_name, default_kwargs=None,
>                  app_name=None, namespace=None):
>         ...
>         self._populating = False
>         self.RLock = threading.RLock()
>         ...
>
>     def _populate(self):
>         if self._populating:
>             self.RLock.acquire()
>             self.RLock.release()
>             return
>         self._populating = True
>         self.RLock.acquire()
>         ...
>         self._populating = False
>         self.RLock.release()
>
>
> Does anyone know if there is a better solution?
>
> Thank you.
>



Re: Possible Bug in RegexURLResolver

2016-07-11 Thread Cristiano Coelho
Sorry about my answer above; I think the locking should actually go like 
this, so all threads can reach the wait line while _populating is True.

def _populate(self):
    self.lock.acquire()

    if self._populating:
        self.lock.wait()
        self.lock.release()
        return

    self._populating = True

    self.lock.release()

    ...

    self.lock.acquire()
    self._populating = False
    self.lock.notify_all()
    self.lock.release()



El lunes, 11 de julio de 2016, 22:14:04 (UTC-3), Cristiano Coelho escribió:
>
> Wouldn't a standard Lock do the trick? Also you are still vulnerable to a 
> race condition when reading self._populating, if the goal is to avoid 
> populating the object more than once in a short interval (in a case where 
> multiple requests hit the server before the object is initialized for the 
> first time?) you are still running all the critical code on all threads if 
> they check self.populating before it is set to True.
>
> Would a condition work better in this case? Something like this.. (add any 
> missing try/finally/with that might be required), warning, NOT TESTED, just 
> an idea.
> The below code will not be prone to race conditions as the variables are 
> always read/set under a lock, and also the critical section code will be 
> only executed ONCE even if multiple threads attempt to run it at once, 
> while still locking all threads to prevent returning before the code is 
> done.
>
> def __init__(self, regex, urlconf_name, default_kwargs=None,
>              app_name=None, namespace=None):
>
>     ...
>
>     self._populating = False
>     self.lock = threading.Condition()
>
> def _populate(self):
>     self.lock.acquire()
>
>     if self._populating:
>         self.lock.wait()
>         self.lock.release()
>         return
>
>     self._populating = True
>
>     ...
>
>     self._populating = False
>     self.lock.notify_all()
>     self.lock.release()
>
>
>
>
>
> El lunes, 11 de julio de 2016, 10:08:50 (UTC-3), a.br...@rataran.com 
> escribió:
>>
>> Hello everyone.
>>
>> As suggested by Markus on django-users group, I'm posting this here too.
>>
>> ---
>>
>> I'm using django (1, 10, 0, u'beta', 1).
>>
>> When I try to reverse url in shell everything goes fine.
>>
>> When under nginx/uwsgi with many concurrent requests I get 
>>
>> ... /local/lib/python2.7/site-packages/django/urls/resolvers.py", line 
>> 241, in reverse_dict
>> return self._reverse_dict[language_code]
>> KeyError: 'it'
>>
>> After a while I figured out that RegexURLResolver is memoized by 
>> get_resolver and so it acts like a singleton across a certain number of 
>> requests.
>>
>> Analyzing the code of RegexURLResolver I found that the method 
>> _populate will return directly if it has been called before and has not 
>> yet finished.
>>
>> ...
>> def _populate(self):
>>     if self._populating:
>>         return
>>     self._populating = True
>>     ...
>>
>> If used for a recursive call in a single thread this will not hurt, but 
>> in my case, in uwsgi multi-thread mode, I got the error.
>>
>> here is my quick and dirty fix:
>>
>> class RegexURLResolver(LocaleRegexProvider):
>>     def __init__(self, regex, urlconf_name, default_kwargs=None,
>>                  app_name=None, namespace=None):
>>         ...
>>         self._populating = False
>>         self.RLock = threading.RLock()
>>         ...
>>
>>     def _populate(self):
>>         if self._populating:
>>             self.RLock.acquire()
>>             self.RLock.release()
>>             return
>>         self._populating = True
>>         self.RLock.acquire()
>>         ...
>>         self._populating = False
>>         self.RLock.release()
>>
>>
>> Does anyone know if there is a better solution?
>>
>> Thank you.
>>
>



Re: Admin-Actions also in the object details view

2016-07-10 Thread Cristiano Coelho
Sorry to bring this up (it's quite a few years old already).

Are there any plans to bring this to life? The ticket seems to have died 
as well.
It would be very useful to have actions reused on the detail view page 
somehow. Right now the only option is to override the template.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/c162bcbd-24f6-43f3-a6c0-bcef864665e1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Logging config tries too hard

2016-09-10 Thread Cristiano Coelho
I had trouble understanding the logging setup the first time, but after 
that, it's quite simple in every project. I usually end up copy/pasting 
some console and db logger config and just adding it to the dict; I don't 
really think it is that difficult.
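For context, the "everything to stdout" setup Ivan asks for can be expressed as a plain dictConfig; a minimal sketch (logger names and format are illustrative, and with Django one would also have to consider how the default 'django' loggers propagate, as discussed in this thread):

```python
import logging.config

LOGGING = {
    "version": 1,
    # keep loggers already defined elsewhere (e.g. by a framework) alive
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "stdout": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",   # resolved to the real sys.stdout
            "formatter": "plain",
        },
    },
    # route everything through the root logger to stdout
    "root": {"handlers": ["stdout"], "level": "INFO"},
}

logging.config.dictConfig(LOGGING)
logging.getLogger("myapp").info("logging goes to stdout")
```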

El martes, 6 de septiembre de 2016, 9:57:16 (UTC-3), Ivan Sagalaev escribió:
>
> Hello everyone,
>
> It's been a while since I last posted here, please forgive if I break any 
> new rules inadvertently.
>
> I'd like to revisit a decision made in [18993][]. My use case is very 
> simple and obvious: I want all logging going into stdout.
>
> As currently implemented, I can't do it easily with a custom `LOGGING` 
> setting, because:
>
> - If I leave existing Django loggers enabled it ties me to the behavior 
> chosen by Django, which doesn't necessarily match what I want. For example, 
> Django won't log debug and info messages if `DEBUG = False`. And with 
> `DEBUG = True` I will be having two log messages for every log into the 
> 'django' logger: one from the Django's own handler, and another after it 
> propagates to my root logger.
>
> - If I disable existing Django loggers, I effectively disable all logging 
> from Django (and from 'py.warnings' for good measure).
>
> In fact, the idea of providing a default logging configuration which a 
> user can then *build upon* isn't really workable with Python logging: you 
> can either fully reuse or fully discard what's been provided, but you can't 
> meaningfully define a consistent configuration. Also, I have my doubts that 
> this "build upon" use case is based on any real demand. In my experience 
> there are only two approaches to configuring logging: "logging? huh, didn't 
> think about it" and "get your hands off my logging, I know what I'm doing!"
>
> The latter, incidentally, is what the old way was doing: define a sensible 
> default value for `LOGGING` and let users to overwrite it completely. 
> Currently, the default logging configuration is hard-coded in 
> `django.utils.log`.
>
> Also, I couldn't find the actual reasoning for the current behavior in the 
> original ticket. It starts talking about having a more useful default, not 
> about changing the way how this default configuration is applied.
>
> [18993]: https://code.djangoproject.com/ticket/18993
>



Re: Implicit ForeignKey index and unique_together

2016-09-16 Thread Cristiano Coelho
I think the issue on Trac is actually something different: it talks about 
the need (or not) for an index when defining a unique constraint. Most 
databases (if not all) will create an index automatically when a unique 
constraint is defined, and correct me if I'm wrong, but PostgreSQL (I 
don't know about Oracle) is the only one that actually treats constraints 
(unique ones included) and indexes as separate things; for SQL Server and 
MySQL the unique constraint is just an additional option of the index.

What Dilyan is talking about, and correct me if I'm wrong again, is the 
redundancy of defining an index on a foreign key if you already have that 
column as the left-most part of another index (unique or not). Most of the 
time it is redundant to have an index A and another one (A, B), since the 
latter will also be used for queries on A. However, this is up for debate, 
since using the (A, B) index can be slower than using just the A index 
because the index is bigger; on the other hand, you save space and 
insert/update/delete overhead by not having two different indexes.

In my case, most of the time I end up with db_index=False on foreign keys 
that I know are covered by an index/unique constraint defined somewhere 
else, to avoid the overhead of the additional index.
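The left-most-prefix behaviour described here is easy to observe directly; a small SQLite sketch (table and index names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_item (customer_id INTEGER, product_id INTEGER);
    -- composite unique index; customer_id is the left-most column
    CREATE UNIQUE INDEX uniq_cust_prod ON order_item (customer_id, product_id);
""")

# A filter on customer_id alone can be answered by the composite index,
# which is why a separate single-column index on the FK is often redundant.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM order_item WHERE customer_id = 1"
).fetchall()
print(plan)
```

The plan reports a search using uniq_cust_prod even though the query filters only on the leading column; the trade-off mentioned above (a wider index can be slower to traverse than a dedicated narrow one) still applies.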

El viernes, 16 de septiembre de 2016, 11:34:52 (UTC-3), Tim Graham escribió:
>
> Did you try to find anything related in Trac? Maybe 
> https://code.djangoproject.com/ticket/24082?
>
> I use this query in Google: postgresql unique index site:
> code.djangoproject.com
>
> On Friday, September 16, 2016 at 9:51:13 AM UTC-4, Dilyan Palauzov wrote:
>>
>> Hello, 
>>
>> according to the documentation models.ForeignKeys creates implicitly an 
>> index on the underlying database. 
>>
>> Wouldn't it be reasonable to change the default behaviour to only create 
>> implicit index, if there is no index_together or unique_together starting 
>> with the name of the foreign key?   In such cases the implicit index is 
>> redundant, at least with Postgresql, as the value can be found fast using 
>> the _together index. 
>>
>> Greetings 
>>Dilian 
>>
>



Re: Adding UNION/INTERSECT/EXCEPT to the ORM

2016-12-26 Thread Cristiano Coelho
Is this going to be different from the pipe (|) and ampersand (&) 
operators on querysets? If I'm not wrong, those can already result in a 
union query (but not necessarily; sometimes they just return a query with 
an OR/AND condition).
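Roughly, `qs1 | qs2` combines filters on the same model with OR inside one SELECT, while the proposed `.union()` emits a true SQL UNION and can combine different tables. The SQL-level difference can be sketched with SQLite (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (username TEXT);
    CREATE TABLE auth_group (name TEXT);
    INSERT INTO auth_user VALUES ('alice'), ('bob');
    INSERT INTO auth_group VALUES ('admins'), ('bob');
""")

# What .union() maps to: a real UNION, deduplicated across both tables,
# with ORDER BY applied to the combined result.
rows = conn.execute("""
    SELECT username AS name FROM auth_user
    UNION
    SELECT name FROM auth_group
    ORDER BY name
""").fetchall()
print(rows)  # 'bob' appears once despite existing in both tables
```

An OR condition, by contrast, can only widen the WHERE clause of a single SELECT over one table (or a join), which is what the `|` operator generates.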

El viernes, 23 de diciembre de 2016, 11:12:40 (UTC-3), Florian Apolloner 
escribió:
>
> Hi,
>
> I have a currently WIP PR at https://github.com/django/django/pull/7727
>
> The usage is currently something like this:
>
> qs1 = User.objects.all().values('username')
> qs2 = Group.objects.all().values('name')
> results = qs1.union(qs2).distinct().order_by('name')[:10]
>
> (order_by does not work though yet)
>
> So far I have a few questions:
>
>  * Should .union/.intersect etc return a new type of queryset or stay with 
> the base QuerySet class (probably less code duplication)
>  * We currently have a few methods which check can_filter and error out 
> accordingly (ie you cannot filter after slicing), but as the error message 
> in 
> https://github.com/django/django/blob/master/django/db/models/query.py#L579 
> shows, this strongly relies on knowledge of the implementation of the 
> filter. For combined querysets I basically need to limit everything aside 
> from order by and limit/offset. Would a method like this make some sense 
> (on the Query class):
>
> def is_allowed(self, action):
>   if self.combinatorial and action not in ('set_operation', 'order_by', 
> 'slicing'):
> raise SomeError('Cannot use this method on a combinatorial queryset')
>   elif action == 'filter' and (self.low_mark or self.high_mark):
> raise SomeError('Cannot filter after slicing')
>
>  * set_operator in base.py feels a bit weird (
> https://github.com/django/django/pull/7727/files#diff-53fcf3ac0535307033e0cfabb85c5301)
>  
> -- any better options?
>  * How can I change the generated order_by clause to reference the columns 
> "unqualified" (ie without table name), can I somehow just realias every 
> column? 
>
> Cheers,
> Florian
>



Re: Automatic prefetching in querysets

2017-08-15 Thread Cristiano Coelho
I would rather have warnings as well; adding more magical behavior is bad 
and might even degrade performance in some cases. Automatically selecting a 
bunch of data that "might" be used is bad, and especially considering how 
slow Python is, accidentally loading/building 1k+ objects when maybe only 
one of them is used would be as bad as doing 1k+ queries.

If the systems you are building are that large and complicated, you can't 
have people with zero SQL knowledge doing the work either! There are so many 
things to tweak: indexes, data denormalization, proper joins here and there, 
unique constraints, locks and race conditions. Anyone attempting to code 
something that's not a blog or hello world really needs to know a bit about 
all of that.


El martes, 15 de agosto de 2017, 6:44:19 (UTC-3), Gordon Wrigley escribió:
>
> I'd like to discuss automatic prefetching in querysets. Specifically 
> automatically doing prefetch_related where needed without the user having 
> to request it.
>
> For context consider these three snippets using the Question & Choice 
> models from the tutorial 
>  
> when 
> there are 100 questions each with 5 choices for a total of 500 choices.
>
> Default
> for choice in Choice.objects.all():
> print(choice.question.question_text, ':', choice.choice_text)
> 501 db queries, fetches 500 choice rows and 500 question rows from the DB
>
> Prefetch_related
> for choice in Choice.objects.prefetch_related('question'):
> print(choice.question.question_text, ':', choice.choice_text)
> 2 db queries, fetches 500 choice rows and 100 question rows from the DB
>
> Select_related
> for choice in Choice.objects.select_related('question'):
> print(choice.question.question_text, ':', choice.choice_text)
> 1 db query, fetches 500 choice rows and 500 question rows from the DB
>
> I've included select_related for completeness; I'm not going to propose 
> changing anything about its use. There are places where it is the best 
> choice, and in those places it will still be up to the user to request it. I 
> will note that anywhere select_related is optimal, prefetch_related is still 
> better than the default, and leave it at that.
>
> The 'Default' example above is a classic example of the N+1 query problem, 
> a problem that is widespread in Django apps.
> This pattern of queries is what new users produce because they don't know 
> enough about the database and / or ORM to do otherwise.
> Experienced users will also often produce this because it's not always 
> obvious what fields will and won't be used and subsequently what should be 
> prefetched.
> Additionally that list will change over time. A small change to a template 
> to display an extra field can result in a denial of service on your DB due 
> to a missing prefetch.
> Identifying missing prefetches is fiddly, time consuming and error prone. 
> Tools like django-perf-rec  
> (which I was involved in creating) and nplusone 
>  exist in part to flag missing 
> prefetches introduced by changed code.
> Finally libraries like Django Rest Framework and the Admin will also 
> produce queries like this because it's very difficult for them to know what 
> needs prefetching without being explicitly told by an experienced user.
>
> As hinted at the top I'd like to propose changing Django so the default 
> code behaves like the prefetch_related code.
> Longer term I think this should be the default behaviour but obviously it 
> needs to be proved first so for now I'd suggest a new queryset function 
> that enables this behaviour.
>
> I have a proof of concept of this mechanism that I've used successfully in 
> production. I'm not posting it yet because I'd like to focus on desired 
> behavior rather than implementation details. But in summary, what it does 
> is when accessing a missing field on a model, rather than fetching it just 
> for that instance, it runs a prefetch_related query to fetch it for all 
> peer instances that were fetched in the same queryset. So in the example 
> above it prefetches all Questions in one query.
>
> This might seem like a risky thing to do but I'd argue that it really 
> isn't.
> The only time this isn't superior to the default case is when you are post 
> filtering the queryset results in Python.
> Even in that case it's only inferior if you started with a large number of 
> results, filtered basically all of them and the code is structured so that 
> the filtered ones aren't garbage collected.
> To cover this rare case the automatic prefetching can easily be disabled 
> on a per queryset or per object basis. Leaving us with a rare downside that 
> can easily be manually resolved in exchange for a significant general 
> improvement.
>
> In practice this thing is almost magical to work with. Unless you already 
> have extensive and tightly maintained prefetches everywhere 
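The mechanism described above (when a missing field is accessed, fetch it for all peer instances with one batched query) can be sketched in plain Python, with dicts standing in for Choice/Question rows; the question data is invented for the demo.

```python
# A minimal sketch of the batching idea behind prefetch_related, no ORM involved.

def fetch_questions(ids):
    """Stand-in for one 'SELECT ... WHERE id IN (...)' query."""
    db = {1: "Past or future?", 2: "Cats or dogs?"}
    return {i: db[i] for i in set(ids)}

choices = [
    {"question_id": 1, "choice_text": "Past"},
    {"question_id": 1, "choice_text": "Future"},
    {"question_id": 2, "choice_text": "Cats"},
    {"question_id": 2, "choice_text": "Dogs"},
]

# Instead of one lookup per choice (the N+1 pattern), gather the ids of all
# peer rows and resolve them in a single batched query, then attach results.
questions = fetch_questions(c["question_id"] for c in choices)
for c in choices:
    c["question_text"] = questions[c["question_id"]]

print(choices[0]["question_text"])  # Past or future?
```

The key point is that the batch is keyed on the peer rows fetched by the same queryset, so four accesses cost one extra query instead of four.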

Re: Should django File wrapper support .next()?

2017-08-16 Thread Cristiano Coelho
I forgot about django dropping support for Python 2, so I guess you are 
right that this doesn't make much sense if next() is removed; but for 
Python 2 it would make sense for django's File object to have the same API 
as the object returned by open().


El martes, 15 de agosto de 2017, 19:21:08 (UTC-3), Adam Johnson escribió:
>
> The next() method is gone from the return value of open() in Python 3, and 
> I don't think it was ever necessary. As per the iterator protocol ( 
> https://docs.python.org/2/library/stdtypes.html#iterator-types ), __iter__ 
> is meant to return an iterator over the object, and *that* is what the 
> next() method (__next__ on Python 3) should be attached to. Django's 
> File.__iter__ is a generator function, thus it returns a generator, which 
> comes with a next()/__next__().
>
> I think you should just try iterating your files by calling iter() on 
> them first, it's Python 3 compatible too.
>
> On 27 July 2017 at 17:24, Cristiano Coelho <cristia...@gmail.com 
> > wrote:
>
>> Hello,
>>
>> I have recently found an interesting issue, using a project that relies 
>> on different storage backends, when switching from a custom one to django's 
>> file system storage, I found that existing code that would iterate files 
>> with the next() call would start to fail, since although django's File is 
>> iterable, it doesn't define the next method.
>>
>> Now I'm wondering if this is on purpose, or a bug. It's odd that every 
>> stream (or almost) from the python library such as everything from the .io 
>> module or simply the object returned by open() (python 2) supports the next 
>> call but the django File wrapper doesn't.
>>
>> This happened with django 1.10 and I believe it wasn't changed with 
>> Django 1.11
>>
>
>
>
> -- 
> Adam
>



Should django File wrapper support .next()?

2017-07-27 Thread Cristiano Coelho
Hello,

I have recently found an interesting issue. Using a project that relies on 
different storage backends, when switching from a custom one to django's 
file system storage, I found that existing code that iterated files with 
the next() call would start to fail: although django's File is iterable, it 
doesn't define the next method.

Now I'm wondering if this is on purpose, or a bug. It's odd that almost 
every stream in the python library, such as everything from the io module 
or simply the object returned by open() (python 2), supports the next call, 
but the django File wrapper doesn't.

This happened with django 1.10, and I believe it wasn't changed in Django 
1.11.
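The behaviour described here can be reproduced without Django: a class whose __iter__ is a generator function is iterable, yet the instance itself exposes no next()/__next__; only the generator returned by iter() does. `FileLike` below is a made-up stand-in for django's File.

```python
# Reproduce the "iterable but no next()" behaviour in plain Python.

class FileLike:
    def __init__(self, lines):
        self.lines = lines

    def __iter__(self):
        # Generator function, analogous to File.__iter__: calling it
        # returns a generator, and *that* object carries __next__.
        for line in self.lines:
            yield line

f = FileLike(["a\n", "b\n"])
assert not hasattr(f, "__next__")  # next(f) would raise TypeError
it = iter(f)                       # the generator does support next()
print(next(it))                    # prints "a" (plus its trailing newline)
```

This is why calling iter() first, as suggested in the reply above, works on both Python 2 and 3.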



Any reason to not use SHA256 (or newer) for Signer / TimeStampSigner classes?

2018-05-08 Thread Cristiano Coelho
Looks like the Signer class (and perhaps other parts of the code) still uses 
SHA1 ([1] and [2]) for the HMAC signing/hashing process.

I'm wondering if there's any specific reason to use SHA1 over newer 
versions, or if it would be worth it to pass the hash algorithm as a 
variable or even config option.



[1] https://github.com/django/django/blob/master/django/core/signing.py#L45
[2] https://github.com/django/django/blob/master/django/utils/crypto.py#L23
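At the HMAC level, "pass the hash algorithm as a variable" reduces to passing digestmod explicitly; the key, message, and the `sign` helper below are invented for illustration.

```python
import hashlib
import hmac

# Hypothetical helper: the digest algorithm is a parameter instead of
# being hard-coded to SHA1.
def sign(key: bytes, msg: bytes, algorithm=hashlib.sha1) -> str:
    return hmac.new(key, msg, algorithm).hexdigest()

sha1_sig = sign(b"secret", b"hello")
sha256_sig = sign(b"secret", b"hello", hashlib.sha256)

# Digest sizes differ: 20 bytes (40 hex chars) vs 32 bytes (64 hex chars),
# so a signature made with one algorithm never verifies under the other --
# which is the backwards-compatibility problem discussed in the replies.
print(len(sha1_sig), len(sha256_sig))  # 40 64
```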



Re: Any reason to not use SHA256 (or newer) for Signer / TimeStampSigner classes?

2018-05-09 Thread Cristiano Coelho
Right, that backwards-compatibility issue seems quite difficult to solve, 
although if the worst that can happen is that all users get logged out, it 
shouldn't be that bad. I will read the ticket in detail.

El martes, 8 de mayo de 2018, 20:31:28 (UTC-3), Tim Graham escribió:
>
> There's a ticket about it: https://code.djangoproject.com/ticket/27468
>
> Backwards compatibility is the main consideration.
>
> On Tuesday, May 8, 2018 at 6:44:05 PM UTC-4, Cristiano Coelho wrote:
>>
>> Looks like the Signer class (and perhaps other parts of the code) still 
>> use SHA1 ([1] and [2]) for the HMAC signing/hashing process.
>>
>> I'm wondering if there's any specific reason to use SHA1 over newer 
>> versions, or if it would be worth it to pass the hash algorithm as a 
>> variable or even config option.
>>
>>
>>
>> [1] 
>> https://github.com/django/django/blob/master/django/core/signing.py#L45
>> [2] 
>> https://github.com/django/django/blob/master/django/utils/crypto.py#L23
>>
>



Add Alias or annotations without group-by support?

2017-12-26 Thread Cristiano Coelho
Hello, I'm having a hard time explaining the exact issue, but I hope it's 
clear enough.


This follows up on this issue 
(https://groups.google.com/forum/#!searchin/django-users/cristiano%7Csort:date/django-users/q6XdfyK29HA/TcE8oFitBQAJ)
from django-users and a related ticket 
(https://code.djangoproject.com/ticket/27719) that seems to have been left 
out or forgotten already.

There has to be a way to alias or annotate a value given an expression or 
SQL function that doesn't necessarily aggregate data but rather works on a 
single value.

Right now as shown on the django-users post, using annotate for this 
purpose will cause unexpected grouping and sub querying that could result 
in very slow and hard to debug queries.

The core issue is that using annotate without a previous call to values or 
values_list works as expected, simply annotating a value and returning it 
as an additional column. But if an aggregate is added afterwards (such as 
count), the final query ends up redundant: the annotated value is added to 
the group by clause (group by id + column) and to a column in the select 
(so the function is called twice), and the whole thing is wrapped into a 
select * subquery. That makes the extra column in the select and group by 
useless, unless the query had some kind of left/inner join, in which case 
the group by might make sense (although I'm not sure about the column 
showing up in the select clause).

The ugly workaround is to simply add a .values('id') at the end so the 
annotated value doesn't show up in the group by and select sections, 
although the nested query still happens.


For this reason, there's currently no way to achieve the above without ugly 
workarounds or unnecessary database performance hits.

The easiest option, I believe, would be to follow the ticket and implement 
an alias call that works exactly like annotate but doesn't trigger any 
grouping.

A more complicated option is to make annotate/aggregate smarter, so the 
unnecessary grouping and subquerying doesn't happen unless needed: for 
example, when the queryset didn't call values/values_list, or when there 
are no relationships/joins involved.


Example/demonstration:

Given the following queryset:

query1 = MyModel.objects.annotate(x=MyFunction('a', 'b')).filter(x__gte=0.6).order_by('-x')


query1 SQL is good and looks like:

SELECT id, a, b, myfunction(a, b) as x
FROM mymodel
WHERE myfunction(a, b) >= 0.6
ORDER BY x desc

Notice how there's no group by, the ORM was smart enough to not include it 
since there was no previous call to values/values_list


If we run query1.count() the final SQL looks like:

SELECT COUNT(*) FROM (
SELECT id, myfunction(a, b) as x
FROM mymodel
WHERE myfunction(a ,b) >= 0.6
GROUP BY id, myfunction(a ,b)
) subquery

which, if myfunction is slow, adds a massive slowdown that's not even 
needed; the query should actually be just:

SELECT count(*)
FROM mymodel
WHERE myfunction(a ,b) >= 0.6


while the grouped query should ONLY happen if the group by makes sense 
(i.e., if there's a join somewhere, or a values/values_list was used 
previously so id is not part of the group by statement).

But if we work around the issue by adding query1.values('id').count(), the 
final query ends up better:

SELECT COUNT(*) FROM (
SELECT id
FROM mymodel
WHERE myfunction(a ,b) >= 0.6
) subquery


I hope I could explain this clearly enough with the example. Note that 
using a custom lookup is not possible, since the value is required for the 
order_by to work.




Re: Add Alias or annotations without group-by support?

2018-03-08 Thread Cristiano Coelho
The workaround, although extremely ugly and likely to cause issues in the 
future (which is why I only used it for the one model that needed these odd 
queries), was to use a custom queryset/manager, something like this:

from django.db.models import Manager, QuerySet

class FasterCountQuerySet(QuerySet):
    def count(self):
        # Bound super: call QuerySet.count() on the values('pk') queryset,
        # keeping the annotation out of the GROUP BY / SELECT list.
        return super(FasterCountQuerySet, self.values('pk')).count()

FasterCountManager = Manager.from_queryset(FasterCountQuerySet)

But again, this is extremely ugly and will still cause a subquery, although 
without the unnecessary group by and extra function calls.
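The super(...) call in the workaround above uses the two-argument bound form: it looks up count on the parent class but binds it to a different instance than self (the values('pk') queryset). A plain-Python reduction, with Base/Sub standing in as made-up stand-ins for QuerySet/FasterCountQuerySet:

```python
class Base:
    def count(self):
        return "base-count"

class Sub(Base):
    def count(self):
        # Delegate to Base.count, bound to a *different* instance than
        # self -- in the workaround, that instance is self.values('pk').
        other = self.transform()
        return super(Sub, other).count()

    def transform(self):
        # Stand-in for self.values('pk'), which returns a new queryset
        # of the same class.
        return Sub()

print(Sub().count())  # base-count
```

Because super(Sub, other) skips Sub in the MRO, the overridden count never recurses; Base's implementation runs against the transformed instance.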


El miércoles, 7 de marzo de 2018, 19:48:01 (UTC-3), Jared Proffitt escribió:
>
> I have also run into this exact problem. Would love to get this fixed. 
> Have you found a good workaround?
>

Re: Add Alias or annotations without group-by support?

2018-03-10 Thread Cristiano Coelho
Would that actually end up executing the same function twice?

I didn't state it in the original question, but the biggest issue is that 
in my use case the annotation step is rather complicated and thus wrapped 
in a method on the model, and it's then up to external code to filter and 
sort by the annotated value. Having to rebuild the expression every single 
time it's needed would defeat the purpose.

I agree though that having more methods on the queryset is bad. I would 
rather improve the annotation logic to handle these cases, but that might 
also be difficult.


El sábado, 10 de marzo de 2018, 8:51:32 (UTC-3), Josh Smeaton escribió:
>
> Sure - but you can always save the expression to a variable and use it 
> multiple times.
>
> mycalc = MyFunc('a', 'b')
> Model.objects.filter(GreaterEqual(mycalc, 0.6)).order_by(mycalc)
>
> I think we already have the building blocks we need to avoid adding 
> another queryset method.
>
> On Saturday, 10 March 2018 14:01:41 UTC+11, Cristiano Coelho wrote:
>>
>> It wouldn't work if you also want to order by the annotated value.
>>
>> El viernes, 9 de marzo de 2018, 8:27:36 (UTC-3), Josh Smeaton escribió:
>>>
>>> Would teaching filter() and friends to use expressions directly solve 
>>> your issue? You suggested using `alias` upthread, but that's only really 
>>> required so you can refer to it later? Unless you wanted to refer to the 
>>> field more than once, having each queryset method respect expressions 
>>> should be enough I think.
>>>
>>> https://github.com/django/django/pull/8119 adds boolean expression 
>>> support to filter. I believe most other queryset methods have support for 
>>> expressions now (order_by, values/values_list).
>>>
>>> For the alias used multiple times case, it should be enough to annotate 
>>> and then restrict with values if you don't actually want it in the 
>>> select/group list.
>>>

Re: Add Alias or annotations without group-by support?

2018-03-09 Thread Cristiano Coelho
It wouldn't work if you also want to order by the annotated value.

El viernes, 9 de marzo de 2018, 8:27:36 (UTC-3), Josh Smeaton escribió:
>
> Would teaching filter() and friends to use expressions directly solve your 
> issue? You suggested using `alias` upthread, but that's only really 
> required so you can refer to it later? Unless you wanted to refer to the 
> field more than once, having each queryset method respect expressions 
> should be enough I think.
>
> https://github.com/django/django/pull/8119 adds boolean expression 
> support to filter. I believe most other queryset methods have support for 
> expressions now (order_by, values/values_list).
>
> For the alias used multiple times case, it should be enough to annotate 
> and then restrict with values if you don't actually want it in the 
> select/group list.
>
> On Friday, 9 March 2018 00:22:00 UTC+11, Cristiano Coelho wrote:
>>
>> The workaround, although extremely ugly and which will probably cause 
>> issues in the future (reason I only used it for the model I needed to do 
>> those odd queries) was to use a custom queryset/manager. Something like 
>> this.
>>
>> class FasterCountQuerySet(QuerySet):
>> def count(self):
>> return super(FasterCountQuerySet, self.values('pk')).count()
>> FasterCountManager = Manager.from_queryset(FasterCountQuerySet)
>>
>> But again, this is extremely ugly and will still cause a subquery, but 
>> without the unnecessary group by and extra function calls.
>>
>>

Re: A faster paginator for django

2018-12-21 Thread Cristiano Coelho
Let's not forget how the various *count *calls starts to kill your database 
when you get over 1 million rows (postgres at least).

So far the only options I have found with postgres are:
- Estimate count for non filtered queries: SELECT reltuples::BIGINT FROM 
pg_class WHERE relname = '%s';
- If queries are filtered, replace the count with a subquery that first 
limits the results to a reasonable number (such as 1k). This is silly and 
won't allow you to go through all results, but at least the count call 
won't kill your database. It's only useful if the filtered query returns 
over one million rows as well.
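
The estimate approach can be wrapped in a small helper. This is a sketch assuming a psycopg-style cursor; estimated_count is a hypothetical name, and the figure is only as fresh as the last VACUUM/ANALYZE on the table:

```python
def estimated_count(cursor, table_name):
    """Return PostgreSQL's approximate row count for a table.

    Reads pg_class.reltuples, which PostgreSQL maintains via
    VACUUM/ANALYZE, so this is O(1) instead of a full table scan.
    """
    cursor.execute(
        "SELECT reltuples::BIGINT FROM pg_class WHERE relname = %s",
        [table_name],
    )
    row = cursor.fetchone()
    return int(row[0]) if row else 0
```

A Django Paginator subclass could override its count property with this for unfiltered querysets and fall back to a real COUNT(*) otherwise.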




On Wednesday, December 5, 2018 at 7:15:22 (UTC-5), Saleem Jaffer 
wrote:
>
> Hi all,
>
> The default paginator that comes with Django is inefficient when dealing 
> with large tables. This is because the final query for fetching pages uses 
> "OFFSET" which is basically a linear scan till the last index of the 
> current page. Does it make sense to have a better paginator which does not 
> use "OFFSET". 
>
> If this sounds like a good idea, I have some ideas on how to do it and 
> with some help from you guys I can implement it.
>
> Saleem
>
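
The OFFSET cost described above is what keyset (a.k.a. seek) pagination avoids: instead of skipping N rows, each page filters on the last value seen, which an index can satisfy directly. A minimal sketch with sqlite3; the item table and keyset_page helper are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO item (name) VALUES (?)",
                 [(f"item-{i}",) for i in range(1, 101)])

def keyset_page(conn, after_id=0, page_size=10):
    # "WHERE id > ?" seeks straight to the page via the primary-key
    # index, so the cost does not grow with page depth the way
    # "OFFSET n" does.
    return conn.execute(
        "SELECT id, name FROM item WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()

page1 = keyset_page(conn)                         # ids 1..10
page2 = keyset_page(conn, after_id=page1[-1][0])  # ids 11..20
```

The trade-off is that you can only step to the next page, not jump to an arbitrary page number. The rough ORM equivalent would be filtering with id__gt=last_seen_id, ordering by id, and slicing to the page size.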

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/f6c2eb33-0514-48c3-84c6-e29cf8592f58%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Use CDN for djangoproject.com

2019-02-13 Thread Cristiano Coelho
Consider AWS's cloudfront then :)

On Tuesday, February 12, 2019 at 2:34:09 (UTC-5), Florian Apolloner 
wrote:
>
> Cloudflare especially is a service we do not want to use. As for the docs 
> only, does the mirror on rtd work better for you? They are probably behind 
> a CDN.
>
> Cheers,
> Florian
>
> On Tuesday, February 12, 2019 at 6:43:41 AM UTC+1, Cheng C wrote:
>>
>> Hi,
>>
>> Is it possible to utilize a CDN service for djangoproject.com, or at 
>> least on docs.djangoproject.com? The site is actually quite fast for me 
>> but I think there is still room for improvement. Cloudflare has sponsored 
>> dozens of open source projects; probably they can 
>> provide free service for django as well.
>>
>> Tested from Melbourne, Australia:
>>
>> https://www.djangoproject.com/
>>  Average Ping: 245ms
>>  Browser: 21 requests, 211KB transferred, Finish: 2.52s, 
>> DOMContentLoaded: 1.16s, Load: 1.48s
>>
>> https://git-scm.com/
>>  Average Ping: 5ms
>>  Browser: 42 requests, 351KB transferred, Finish: 717ms, 
>> DOMContentLoaded: 564ms, Load: 699ms
>>
>> Tested on Chrome with "Disable cache" checked (but not the first time 
>> visit, so DNS query time might not be included).
>>
>> Best regards and thanks for all your great work. 
>>
>



Re: Extend FAQ with "How do I get Django and my JS framework to work together?"

2019-02-04 Thread Cristiano Coelho
Pointing to Django Rest Framework should be more than enough. Anyone 
starting a project with Django and an SPA/JS architecture will pretty much 
end up with DRF. Correct me if I'm wrong.


On Monday, February 4, 2019 at 19:52:42 (UTC-5), Maciek Olko wrote:
>
> I didn't find this topic being discussed before.
>
> It seems to me a good idea to place "How do I get Django and my JS 
> framework to work together?" or a similar question and answer in the FAQ 
> in Django's docs.
>
> Given the huge popularity of JS frameworks now, it is indeed a very 
> common question to ask or search for on the Internet. Such a question for 
> ReactJS has 65k views on StackOverflow [1]. 
>
> The answer could briefly explain that one can make use of a Django 
> backend and produce an API to be consumed by JS clients. Probably it would 
> be good to link to the Django REST framework project as a suggestion for 
> big projects.
>
> What do you think about such addition to FAQ?
>
> Regards,
> Maciej
>
> [1] 
> https://stackoverflow.com/questions/41867055/how-to-get-django-and-reactjs-to-work-together
>



Re: Should SECRET_KEY be allowed to be bytes?

2022-08-03 Thread Cristiano Coelho
Years later, sorry. But this is still broken and SECRET_KEY management is a 
mess!

Even though you can now use bytes, this line here will blow up if you 
attempt to use bytes as a secret key: 
https://github.com/django/django/blob/3.2.14/django/core/checks/security/base.py#L202

Basically, we are unable to use bytes in the secret key (and we want bytes 
so we can use a short string for other HMAC/signing operations). This means 
that our "str" keys will be twice as big (if encoded as hex) and will also 
always end up hashed, because a 64-byte secret becomes a 128-character hex 
string, which is over the HMAC-SHA256 block size.
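
To make the block-size point concrete, here is a small demonstration with Python's stdlib (a sketch, not Django's code). Per RFC 2104, HMAC pre-hashes any key longer than the underlying hash's block size, which is 64 bytes for SHA-256; so a 128-character hex key is folded down to a 32-byte digest before use, while the raw 64-byte secret would be used as-is:

```python
import hashlib
import hmac
import os

BLOCK_SIZE = hashlib.sha256().block_size  # 64 bytes for SHA-256

raw_key = os.urandom(64)           # exactly the block size: used as-is
hex_key = raw_key.hex().encode()   # 128 bytes: over the block size, so
                                   # HMAC replaces it with sha256(hex_key)

assert len(raw_key) == BLOCK_SIZE
assert len(hex_key) == 2 * BLOCK_SIZE

# Both keys still yield valid 256-bit MACs; the difference is only in
# how the key is folded before the inner/outer padding step.
mac = hmac.new(hex_key, b"message", hashlib.sha256).hexdigest()
assert len(mac) == 64  # hex digest of a 32-byte MAC
```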

On Wednesday, December 28, 2016 at 6:45:50 AM UTC-3, Aymeric Augustin 
wrote:

> I’m happy with that.
>
> -- 
> Aymeric.
>
> On 27 Dec 2016, at 19:49, Tim Graham  wrote:
>
> Thanks Aymeric. How about this documentation addition:
>
> Uses of the key shouldn't assume that it's text or bytes. Every use should 
> go
> through :func:`~django.utils.encoding.force_text` or
> :func:`~django.utils.encoding.force_bytes` to convert it to the desired 
> type.
>
> https://github.com/django/django/pull/7750
>
> Adam created https://code.djangoproject.com/ticket/27635 about the "use 
> secrets" idea.
>
> On Saturday, December 24, 2016 at 4:52:38 PM UTC-5, Aymeric Augustin wrote:
>>
>> Hello Andres,
>>
>> We both seem to agree with the status quo — supporting both text and 
>> bytes.
>>
>>
>> On 24 Dec 2016, at 00:36, 'Andres Mejia' via Django developers 
>> (Contributions to Django itself)  wrote:
>>
>> On 12/22/2016 05:15 PM, Aymeric Augustin wrote:
>>
>> export SECRET_KEY=… # generated with pwgen -s 50
>>
>> What do you think is ultimately being used in the pwgen program? I'm 
>> going to guess, at least on POSIX systems, it is /dev/urandom or 
>> /dev/random, both of which return random bytes.
>>
>>
>> I understand this, but it doesn’t change my argument. I’m saying that the 
>> format of SECRET_KEY doesn’t matter, as long as it contains enough entropy, 
>> since it will be injected into hashing algorithms designed to extract the 
>> entropy. I think we can agree on this.
>>
>> We have different preferences for that format. You like keeping the 
>> original raw binary data SECRET_KEY. I find it more convenient to convert 
>> it to an ASCII-safe format, for example with pwgen. I really think this 
>> boils down to taste. I don’t think we can conclusively determine that one 
>> approach is superior to the other. I think my technique is more beginner 
>> friendly; while not applicable to you, it’s a concern for Django in general.
>>
>> The only cost of supporting both options is that every use must go either 
>> through force_text or force_bytes to convert to a known type. 
>>
>> - “I think it's fair to assume devs using the SECRET_KEY know it must be 
>> used as bytes.” — well that doesn't include me or any Django dev I ever 
>> talked to about this topic
>>
>> (..)
>>
>>
>> Oops, I misunderstood “used as bytes” to mean “defined as bytes”. Sorry. 
>> I withdraw this.
>>
>>
>> And since I’ve been waving my hands about the types Django expects in a 
>> previous email, here’s the full audit. Below, *text* means unicode on 
>> Python 2 and str on Python 3. *ASCII-safe bytes* means bytes containing 
>> only ASCII characters, so they can be used transparently as if they were 
>> text on Python 2, because it will call decode() implicitly.
>>
>> - django/conf/global_settings.py
>>
>> Sets the default to an empty *text* string (note the unicode_literals 
>> import at the top for Python 2).
>>
>> - django/conf/settings.py-tpl
>>
>> Sets the generated value to *ASCII-safe bytes* on Python 2 and *text* on 
>> Python 3 (no unicode_literals there).
>>
>> - django/core/signing.py:
>>
>> Calls force_bytes to support *bytes* and *text* in get_cookie_signer.
>>
>> - django/utils/crypto.py:
>>
>> Calls force_bytes to support *bytes* and *text* in salted_hmac.
>>
>> *Assumes SECRET_KEY contains text* in the `if not using_sysrandom` 
>> branch of `get_random_string`. This is the bug I hinted to in a previous 
>> email. It must have appeared when adding the unicode_literals import to 
>> that file. No one complained since June 2012. It only affects people 
>> setting their SECRET_KEY to bytes on Python 3 or ASCII-unsafe bytes on 
>> Python 2 on Unix-like systems that don’t provide /dev/urandom. This sounds 
>> uncommon.
>>
>> While we’re there, we should use 
>> https://docs.python.org/3/library/secrets.html#module-secrets on Python 
>> >= 3.6.
>>
>>
>> Best regards,
>>
>> -- 
>> Aymeric.
>>
>>
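
The convention the audit above describes (every use of SECRET_KEY goes through force_text or force_bytes) can be sketched outside Django like this. The force_bytes below is a minimal stand-in for django.utils.encoding.force_bytes, and the salted_hmac mirrors the pattern in django/utils/crypto.py rather than reproducing it exactly:

```python
import hashlib
import hmac

def force_bytes(value, encoding="utf-8"):
    # Minimal stand-in for django.utils.encoding.force_bytes: pass bytes
    # through unchanged, encode text.
    return value if isinstance(value, bytes) else value.encode(encoding)

def salted_hmac(key_salt, value, secret):
    # Same pattern as django/utils/crypto.py: the secret may be str or
    # bytes, so it is normalized before being mixed into the key.
    key = hashlib.sha1(force_bytes(key_salt) + force_bytes(secret)).digest()
    return hmac.new(key, msg=force_bytes(value), digestmod=hashlib.sha1)

# A text secret and its bytes equivalent produce the same MAC, which is
# why supporting both types only costs the normalization step.
assert (salted_hmac("salt", "value", "secret").hexdigest()
        == salted_hmac("salt", "value", b"secret").hexdigest())
```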