Re: QuerySet.iterator together with prefetch_related because of chunk_size

2019-01-14 Thread Adam Johnson
>
> ...what if we required chunk_size to be explicitly specified when the
> queryset has prefetch lookups?


This seems reasonable, but what would you do in the case where chunk_size
isn't explicitly defined - throw an exception? Currently it silently fails to
prefetch, which means N + 1 queries, so even prefetching with the default
chunk_size of 2000 would be a huge gain in cases where chunk_size isn't
defined.
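
For illustration (hypothetical models, not from this thread), this is the
pattern that currently bites people:

    from django.db import models

    class Author(models.Model):
        name = models.CharField(max_length=100)

    class Book(models.Model):
        author = models.ForeignKey(Author, on_delete=models.CASCADE)

    # iterator() silently drops the prefetch, so each access of the
    # relation runs its own query: N + 1 queries in total.
    for author in Author.objects.prefetch_related('book_set').iterator():
        list(author.book_set.all())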

On Sun, 13 Jan 2019 at 02:05, charettes  wrote:

> Replying to my concerns about the P * C queries.
>
> Throwing that idea out there but what if we required chunk_size to be
> explicitly specified when the queryset has prefetch lookups?
>
> That would at least force developers to consider how many prefetching
> queries iterating through the results would require. Plus, since the
> argument was only recently introduced, it is less likely to silently cause
> changes under developers' feet.
>
> Simon
>
> Le vendredi 26 octobre 2018 20:12:02 UTC-4, charettes a écrit :
>>
>> Josh, I agree that silently not working is problematic, but it has been
>> this way since prefetch_related() was introduced.
>>
>> Something to keep in mind as well is that silently turning it on would
>> also perform P * C extra queries, where P is the number of prefetches
>> requested through prefetch_related() and C the number of chunks the
>> results contain. This is non-negligible IMO.
>>
>> What I'd be in favor of is raising an exception on
>> prefetch_related(*prefetches).iterator() in the next release, at least to
>> allow users using this pattern to adjust their code, and then shipping the
>> optimization, with all the warnings related to the interactions between
>> prefetch_related(*prefetches) and iterator(chunk_size), in the next one.
>>
>> I'm not completely against skipping the exception release phase entirely,
>> given there might be users out there accessing attributes expected to be
>> prefetched on iterated instances and inadvertently performing tons of
>> queries, but the exception phase just feels safer given iterator() is
>> usually used in memory-constrained contexts.
>>
>> Simon
>>
>> Le vendredi 26 octobre 2018 19:27:55 UTC-4, Josh Smeaton a écrit :
>>>
>>> I tend to agree with Tobi. Prefetching silently not working on iterator
>>> can be quite confusing unless you have a good understanding of both APIs.
>>> It might be possible to do what you're asking, but it'd mean that Django
>>> is now actually caching the result when it explicitly says it isn't -
>>> even if the result is a much smaller moving cache. Prefetching chunk_size
>>> results per chunk is unlikely to make a material difference to memory
>>> usage. Users are usually concerned about the entire result set of the
>>> primary table.
>>>
>>> I don't know if you can change the API to make these suggested changes
>>> without also impacting how we iterate over result sets - but I'd be
>>> interested in seeing a proof of concept at the very least.
>>>
>>>
>>>
>>> On Monday, 15 October 2018 20:41:13 UTC+11, tobias@truffls.com
>>> wrote:

 Thank you for your feedback. I would like to answer some statements to
 either convince you or make it clearer where my idea stems from:

 The fundamental reason why iterator() cannot be used with
> prefetch_related() is that the latter requires a set of model instances to
> be materialized to work appropriately, which chunk_size doesn't control at
> all.
> In other words, chunk_size only controls how many rows should be
> fetched from the database cursor and kept in memory at a time. Even when
> this parameter is used, iterator() will only materialize a single model
> instance per yield.
>

 It should be easily possible to change the relevant code in
 ModelIterable to materialize the retrieved rows in batches. After
 materializing the batch / chunk, it could do the prefetching:
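
 A rough sketch of the idea, using Django's public prefetch_related_objects()
 helper (note: _prefetch_related_lookups is a private attribute, so treat
 this as illustrative only):

     from django.db.models import prefetch_related_objects

     def iterate_with_prefetch(queryset, chunk_size=2000):
         lookups = queryset._prefetch_related_lookups  # private attribute
         chunk = []
         for obj in queryset.iterator(chunk_size=chunk_size):
             chunk.append(obj)
             if len(chunk) >= chunk_size:
                 # run the prefetch queries for this chunk only
                 prefetch_related_objects(chunk, *lookups)
                 yield from chunk
                 chunk = []
         if chunk:
             prefetch_related_objects(chunk, *lookups)
             yield from chunk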


> The fact that iterator() always ignored prefetch_related lookups instead
> of erroring out when they were specified makes me feel like turning such a
> feature on by default could be problematic, as it could balloon memory
> usage - keeping that low is the main reason iterator() is useful anyway.
>

 I would argue that users who thoughtlessly applied prefetching
 together with iterator would now actually get what they expected: fewer DB
 query round trips traded against a little more memory usage.

 Best,
 Tobi


Re: QuerySet.iterator together with prefetch_related because of chunk_size

2019-01-14 Thread charettes
> This seems reasonable, but what would you do in the case where chunk_size isn't 
explicitly defined - throw an exception?

That would be the plan.

> Currently it silently fails to prefetch, which means N + 1 queries, so 
even prefetching with the default chunk_size of 2000 would be a huge gain in 
cases where chunk_size isn't defined.

That's assuming related relationships are actually accessed during 
iteration over the objects. If that's not currently the case, turning this 
on would cause P * C extra queries.
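
To make the arithmetic concrete (hypothetical numbers): iterating 100,000 
rows with chunk_size=2000 gives C = 50 chunks; with P = 3 prefetch lookups 
that's P * C = 150 extra queries on top of the 50 chunk fetches.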

To summarize, here are the options we have:

1. Always turn on the prefetching feature.
2. Require chunk_size to be explicitly specified for a deprecation period 
in order to enable the feature.

Pros/cons of 1: developers who didn't follow the documentation and/or 
didn't notice that prefetching wasn't working and left some lingering 
prefetch_related() will probably see their code perform fewer queries, at 
the cost of a slight increase in memory (given they discard objects while 
iterating). This is also likely to break code because of some backends' 
query parameter limits (e.g. on SQLite, chunk_size=2000 > 1000)[0]; see 
the note below.
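
(Note, with illustrative names: each prefetch query selects the related 
rows with an IN clause containing one parameter per parent row in the 
chunk - roughly SELECT ... WHERE parent_id IN (?, ?, ..., ?) - and 
SQLite's historical default limit on query parameters is 999, so a chunk 
of 2000 parents overflows it.)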

Pros/cons of 2: most prefetch_related().iterator() usage won't break for 
developers who followed the documentation, and it will fail for the others, 
requiring them to acknowledge what their code is going to start doing.
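
A minimal sketch of what that check could look like (illustrative only, 
not actual Django code; _prefetch_related_lookups is a private attribute):

    def iterator(self, chunk_size=None):
        if chunk_size is None:
            if self._prefetch_related_lookups:
                raise ValueError(
                    'chunk_size must be provided explicitly when '
                    'iterator() is used after prefetch_related().'
                )
            chunk_size = 2000  # the current default
        ...  # proceed with chunked iteration as today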

As I've expressed in my previous messages, I'm slightly in favor of 2, even 
if it is likely to cause more breakage, because of the exceptional situation 
where this API should have raised an exception from the beginning.

Cheers,
Simon

[0] https://code.djangoproject.com/ticket/27833

Le lundi 14 janvier 2019 09:53:45 UTC-5, Adam Johnson a écrit :
>
> ...what if we required chunk_size to be explicitly specified when the 
>> queryset has prefetch lookups?
>
>
> This seems reasonable, but what would you do in the case where chunk_size isn't 
> explicitly defined - throw an exception? Currently it silently fails to 
> prefetch, which means N + 1 queries, so even prefetching with the default 
> chunk_size of 2000 would be a huge gain in cases where chunk_size isn't 
> defined.
>
> On Sun, 13 Jan 2019 at 02:05, charettes wrote:
>
>> Replying to my concerns about the P * C queries.
>>
>> Throwing that idea out there but what if we required chunk_size to be 
>> explicitly specified when the queryset has prefetch lookups?
>>
>> That would at least force developers to consider how many prefetching 
>> queries iterating through the results would require. Plus, since the 
>> argument was only recently introduced, it is less likely to silently cause 
>> changes under developers' feet.
>>
>> Simon
>>
>> Le vendredi 26 octobre 2018 20:12:02 UTC-4, charettes a écrit :
>>>
>>> Josh, I agree that silently not working is problematic, but it has been 
>>> this way since prefetch_related() was introduced.
>>>
>>> Something to keep in mind as well is that silently turning it on would 
>>> also perform P * C extra queries, where P is the number of prefetches 
>>> requested through prefetch_related() and C the number of chunks the 
>>> results contain. This is non-negligible IMO.
>>>
>>> What I'd be in favor of is raising an exception on 
>>> prefetch_related(*prefetches).iterator() in the next release, at least to 
>>> allow users using this pattern to adjust their code, and then shipping the 
>>> optimization, with all the warnings related to the interactions between 
>>> prefetch_related(*prefetches) and iterator(chunk_size), in the next one.
>>>
>>> I'm not completely against skipping the exception release 
>>> phase entirely, given there might be users out there accessing attributes 
>>> expected to be prefetched on iterated instances and inadvertently 
>>> performing tons of queries, but the exception phase just feels safer given 
>>> iterator() is usually used in memory-constrained contexts.
>>>
>>> Simon
>>>
>>> Le vendredi 26 octobre 2018 19:27:55 UTC-4, Josh Smeaton a écrit :

 I tend to agree with Tobi. Prefetching silently not working on iterator 
 can be quite confusing unless you have a good understanding of both APIs. 
 It might be possible to do what you're asking, but it'd mean that Django is 
 now actually caching the result when it explicitly says it isn't - even if 
 the result is a much smaller moving cache. Prefetching chunk_size results 
 per chunk is unlikely to make a material difference to memory usage. Users 
 are usually concerned about the entire result set of the primary table.

 I don't know if you can change the API to make these suggested changes 
 without also impacting how we iterate over result sets - but I'd be 
 interested in seeing a proof of concept at the very least.



 On Monday, 15 October 2018 20:41:13 UTC+11, tobias@truffls.com 
 wrote:
>
> Thank you for your feedback. I would like to answer some statements to 
> either convince you or make it clearer where my idea stems

Re: QuerySet.iterator together with prefetch_related because of chunk_size

2019-01-14 Thread Adam Johnson
I agree with your reasoning, and also favour option 2 now, especially since
the default can break on SQLite.

It would also be possible to go through a deprecation period that first
raises a warning and doesn't prefetch, before a later version raises an
exception, which is probably kinder since it was previously documented that
this was simply a no-op. Something like the sketch below.
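
A sketch of that warning path (illustrative only;
_prefetch_related_lookups is a private attribute):

    import warnings

    def iterator(self, chunk_size=None):
        # Illustrative fragment: keep the historical no-prefetch behaviour
        # for now, but warn that it will become an error in a future release.
        if self._prefetch_related_lookups and chunk_size is None:
            warnings.warn(
                'iterator() currently ignores prefetch_related(); pass an '
                'explicit chunk_size to enable prefetching. This will raise '
                'an exception in a future release.',
                DeprecationWarning,
                stacklevel=2,
            )
        ...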

On Mon, 14 Jan 2019 at 19:03, charettes  wrote:

> > This seems reasonable, but what would you do in the case where chunk_size isn't
> explicitly defined - throw an exception?
>
> That would be the plan.
>
> > Currently it silently fails to prefetch, which means N + 1 queries, so
> even prefetching with the default chunk_size of 2000 would be a huge gain in
> cases where chunk_size isn't defined.
>
> That's assuming related relationships are actually accessed during
> iteration over the objects. If that's not currently the case, turning this
> on would cause P * C extra queries.
>
> To summarize, here are the options we have:
>
> 1. Always turn on the prefetching feature.
> 2. Require chunk_size to be explicitly specified for a deprecation period
> in order to enable the feature.
>
> Pros/cons of 1: developers who didn't follow the documentation and/or
> didn't notice that prefetching wasn't working and left some lingering
> prefetch_related() will probably see their code perform fewer queries, at
> the cost of a slight increase in memory (given they discard objects while
> iterating). This is also likely to break code because of some backends'
> query parameter limits (e.g. on SQLite, chunk_size=2000 > 1000)[0]
>
> Pros/cons of 2: most prefetch_related().iterator() usage won't break for
> developers who followed the documentation, and it will fail for the others,
> requiring them to acknowledge what their code is going to start doing.
>
> As I've expressed in my previous messages, I'm slightly in favor of 2, even
> if it is likely to cause more breakage, because of the exceptional situation
> where this API should have raised an exception from the beginning.
>
> Cheers,
> Simon
>
> [0] https://code.djangoproject.com/ticket/27833
>
> Le lundi 14 janvier 2019 09:53:45 UTC-5, Adam Johnson a écrit :
>>
>> ...what if we required chunk_size to be explicitly specified when the
>>> queryset has prefetch lookups?
>>
>>
>> This seems reasonable, but what would you do in the case where chunk_size isn't
>> explicitly defined - throw an exception? Currently it silently fails to
>> prefetch, which means N + 1 queries, so even prefetching with the default
>> chunk_size of 2000 would be a huge gain in cases where chunk_size isn't
>> defined.
>>
>> On Sun, 13 Jan 2019 at 02:05, charettes  wrote:
>>
>>> Replying to my concerns about the P * C queries.
>>>
>>> Throwing that idea out there but what if we required chunk_size to be
>>> explicitly specified when the queryset has prefetch lookups?
>>>
>>> That would at least force developers to consider how many
>>> prefetching queries iterating through the results would require. Plus,
>>> since the argument was only recently introduced, it is less likely to
>>> silently cause changes under developers' feet.
>>>
>>> Simon
>>>
>>> Le vendredi 26 octobre 2018 20:12:02 UTC-4, charettes a écrit :

 Josh, I agree that silently not working is problematic, but it has been
 this way since prefetch_related() was introduced.

 Something to keep in mind as well is that silently turning it on would
 also perform P * C extra queries, where P is the number of prefetches
 requested through prefetch_related() and C the number of chunks the results
 contain. This is non-negligible IMO.

 What I'd be in favor of is raising an exception on
 prefetch_related(*prefetches).iterator() in the next release, at least to
 allow users using this pattern to adjust their code, and then shipping the
 optimization, with all the warnings related to the interactions between
 prefetch_related(*prefetches) and iterator(chunk_size), in the next one.

 I'm not completely against skipping the exception release
 phase entirely, given there might be users out there accessing attributes
 expected to be prefetched on iterated instances and inadvertently
 performing tons of queries, but the exception phase just feels safer given
 iterator() is usually used in memory-constrained contexts.

 Simon

 Le vendredi 26 octobre 2018 19:27:55 UTC-4, Josh Smeaton a écrit :
>
> I tend to agree with Tobi. Prefetching silently not working on
> iterator can be quite confusing unless you have a good understanding of
> both APIs. It might be possible to do what you're asking, but it'd mean
> that Django is now actually caching the result when it explicitly says it
> isn't - even if the result is a much smaller moving cache. Prefetching
> chunk_size results per chunk is unlikely to make a material difference to
> memory usage. Users are usually concerned about the entire result set of
> the primary 

Re: makemessages - Add an option to disable fuzzy translations?

2019-01-14 Thread אורי
Thank you, I created my own make_messages command:

https://github.com/speedy-net/speedy-net/blob/staging/speedy/core/base/management/commands/make_messages.py
אורי
u...@speedy.net


On Fri, Jan 11, 2019 at 10:10 AM Claude Paroz  wrote:

> Le vendredi 11 janvier 2019 07:46:04 UTC+1, Uri Even-Chen a écrit :
>>
>> https://code.djangoproject.com/ticket/10852
>>
>> I also don't like the "fuzzy" keyword and I spent hours deleting fuzzy
>> entries from our django.po files after running makemessages.
>>
>
> Hi Uri,
>
> Create your own makemessages command and provide your custom
> msgmerge_options. Something like:
>
> from django.core.management.commands.makemessages import Command as MakeMessagesCommand
>
> class Command(MakeMessagesCommand):
>     msgmerge_options = ['-q', '--previous', '--no-fuzzy-matching']
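>
> (For reference, those are standard GNU gettext msgmerge options: -q keeps
> the output quiet, --previous records each message's previous msgid as a
> comment, and --no-fuzzy-matching disables the fuzzy matching that produces
> the "fuzzy" entries in the first place.)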
>
> As for your second question, you should use a dedicated tool to translate
> messages (poedit, gtranslator, virtaal, lokalize, etc.), so you won't have
> to search for untranslated messages yourself. Even text editors like vim or
> emacs provide a po mode for translations.
>
> HTH,
>
> Claude
>



Re: Request to reconsider ticket #27910: using an Enum class in model Field choices

2019-01-14 Thread raphael
I saw this thread and wanted to share a bit of a gotcha we had with enums 
internally, for anyone trying to handle this stuff.

Internally, we have a wrapper around CharField to work with enums (we go 
further than just choices and actually have the values be enums), but 
there are unfortunately still a good number of issues in certain use cases 
that make it not a great fit to be put into general usage in Django. I do 
think it's possible to get things right, but you have to establish certain 
rules, because the way enums work in Django can sometimes be surprising.

We notably had an issue with translations, and I believe the problem would 
occur with other context-sensitive values. Because database operations will 
often copy values before saving them to the database, you're beholden to 
(what I believe to be) slightly busted semantics of `copy.copy` for enums. 
Defining a django-y enum subclass that provides identity/lookup helpers 
would probably make this more usable for the general public (much like what 
other people have said); a sketch of that idea follows the traceback below.

In [9]: import copy
In [10]: from enum import Enum
In [11]: from django.utils.translation import ugettext_lazy, override
In [12]: class C(Enum):
    ...:     a = ugettext_lazy("Some Word")

In [13]: with override('en'):
    ...:     elt = C.a
    ...:     with override('ja'):
    ...:         copy.copy(elt)
    ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13> in <module>()
      2     elt = C.a
      3     with override('ja'):
----> 4         copy.copy(elt)
      5

/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/copy.py in copy(x)
    104     if isinstance(rv, str):
    105         return x
--> 106     return _reconstruct(x, None, *rv)
    107
    108

/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/copy.py in _reconstruct(x, memo, func, args, state, listiter, dictiter, deepcopy)
    272     if deep and args:
    273         args = (deepcopy(arg, memo) for arg in args)
--> 274     y = func(*args)
    275     if deep:
    276         memo[id(x)] = y

/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/enum.py in __call__(cls, value, names, module, qualname, type, start)
    289         """
    290         if names is None:  # simple value lookup
--> 291             return cls.__new__(cls, value)
    292         # otherwise, functional API: we're creating a new Enum type
    293         return cls._create_(value, names, module=module, qualname=qualname, type=type, start=start)

/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/enum.py in __new__(cls, value)
    531             return member
    532         # still not found -- try _missing_ hook
--> 533         return cls._missing_(value)
    534
    535     def _generate_next_value_(name, start, count, last_values):

/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/enum.py in _missing_(cls, value)
    544     @classmethod
    545     def _missing_(cls, value):
--> 546         raise ValueError("%r is not a valid %s" % (value, cls.__name__))
    547
    548     def __repr__(self):

ValueError: 'Japanese Version of Some Word' is not a valid C
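
A minimal sketch of the kind of helper I mean (an illustration, not an 
existing Django or stdlib API): since enum members are singletons, copying 
can simply return the member itself instead of re-constructing it by value, 
which sidesteps the failing lookup above.

    import copy
    from enum import Enum

    class CopySafeEnum(Enum):
        # Members are singletons; returning self avoids the by-value
        # re-lookup that breaks for context-sensitive (lazy) values.
        def __copy__(self):
            return self

        def __deepcopy__(self, memo):
            return self

    # With this base class, copy.copy(member) is member  ->  True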
