Re: Clearing prefetch related on add(), change(), remove()

2016-06-07 Thread Florian Apolloner
Same feeling as Carl here. I was probably the first to get asked on IRC 
whether or not this was a bug, and my initial thought was also "wtf, that 
is clearly a bug" -- hence I asked Yoong Kang Lim to open a ticket.

Cheers,
Florian

On Tuesday, June 7, 2016 at 7:47:29 PM UTC+2, Carl Meyer wrote:
>
> On 06/07/2016 06:11 AM, Marc Tamlyn wrote: 
> > I may be "too close" to knowing the implementation of this feature to be 
> > able to comment on whether the behaviour is surprising to most people, 
> > but it doesn't surprise me. It's certainly analogous to how calling 
> > `MyModel.objects.create()` doesn't change an already executed 
> > queryset. There's a question of where you draw the line as well - what 
> > about `related_set.update()`? 
> > 
> > I think it's worth documenting the behaviour, also noting that you can 
> > "force" the execution of a new queryset by chaining another .all(). 
>
> Hmm, I have the opposite instinct. I don't find it analogous to the case 
> of some unrelated queryset object failing to update its internal cache 
> when the database changes. In this case we have a related-manager with 
> an internal cache, and we make changes _via that same manager_. I find 
> it quite surprising that the manager doesn't automatically clear its 
> cache in that case. 
>
> A much stronger precedent, I think, is the fact that QuerySet.update() 
> does clear the queryset's internal result cache. In light of that, I 
> think the current behavior with prefetched-data on a related manager is 
> a bug that should be fixed (though it certainly should be mentioned in 
> the release notes). 
>
> Carl 
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/61cce517-3117-42ec-aceb-de8bb3a7b7cc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Clearing prefetch related on add(), change(), remove()

2016-06-07 Thread Carl Meyer
On 06/07/2016 06:11 AM, Marc Tamlyn wrote:
> I may be "too close" to knowing the implementation of this feature to be
> able to comment on whether the behaviour is surprising to most people,
> but it doesn't surprise me. It's certainly analogous to how calling
> `MyModel.objects.create()` doesn't change an already executed
> queryset. There's a question of where you draw the line as well - what
> about `related_set.update()`?
> 
> I think it's worth documenting the behaviour, also noting that you can
> "force" the execution of a new queryset by chaining another .all().

Hmm, I have the opposite instinct. I don't find it analogous to the case
of some unrelated queryset object failing to update its internal cache
when the database changes. In this case we have a related-manager with
an internal cache, and we make changes _via that same manager_. I find
it quite surprising that the manager doesn't automatically clear its
cache in that case.

A much stronger precedent, I think, is the fact that QuerySet.update()
does clear the queryset's internal result cache. In light of that, I
think the current behavior with prefetched-data on a related manager is
a bug that should be fixed (though it certainly should be mentioned in
the release notes).
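
As a rough illustration of that precedent, here is a toy sketch in plain
Python. It is not Django's code: `ToyQuerySet`, `_table` and `_fetch_all` are
made-up names, and the only point is that a write made through the object
drops that same object's cached results.

```python
# Toy model only: hypothetical names, not Django's implementation.

class ToyQuerySet:
    """A queryset-like object that caches its results after first use."""

    def __init__(self, table):
        self._table = table          # shared list of dicts, our "database"
        self._result_cache = None    # filled on first iteration

    def _fetch_all(self):
        # "Run the query" only if nothing has been cached yet.
        if self._result_cache is None:
            self._result_cache = [dict(row) for row in self._table]
        return self._result_cache

    def __iter__(self):
        return iter(self._fetch_all())

    def update(self, **fields):
        # Write to the "database", then forget our own cached results.
        for row in self._table:
            row.update(fields)
        self._result_cache = None
        return len(self._table)


table = [{"name": "x", "done": False}]
qs = ToyQuerySet(table)

before = [row["done"] for row in qs]   # evaluates and caches
updated = qs.update(done=True)         # write via the same object
after = [row["done"] for row in qs]    # cache was dropped, so re-fetched

print(before, updated, after)  # [False] 1 [True]
```

The argument is that a related manager holding prefetched data is in the
same position as `qs` here, since the write goes through the very object that
owns the cache.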

Carl



Re: Clearing prefetch related on add(), change(), remove()

2016-06-07 Thread Marc Tamlyn
I may be "too close" to knowing the implementation of this feature to be
able to comment on whether the behaviour is surprising to most people, but
it doesn't surprise me. It's certainly analogous to how calling
`MyModel.objects.create()` doesn't change an already executed queryset.
There's a question of where you draw the line as well - what about
`related_set.update()`?

I think it's worth documenting the behaviour, also noting that you can
"force" the execution of a new queryset by chaining another .all().
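
To make the workaround concrete, here is a toy sketch in plain Python, not
Django itself. `ToyQuerySet` and `ToyRelatedManager` are made-up names that
only mimic the caching rules described in this thread: the manager's all()
hands back the already evaluated, prefetched queryset, while chaining another
.all() yields a fresh copy with an empty cache.

```python
# Toy model only: hypothetical names, not Django's implementation.

class ToyQuerySet:
    def __init__(self, table):
        self._table = table
        self._result_cache = None

    def all(self):
        # A new queryset over the same table, with nothing cached yet.
        return ToyQuerySet(self._table)

    def __iter__(self):
        if self._result_cache is None:
            self._result_cache = list(self._table)   # "run the query"
        return iter(self._result_cache)


class ToyRelatedManager:
    def __init__(self, table):
        self._table = table
        self._prefetched = None

    def prefetch(self):
        qs = ToyQuerySet(self._table)
        list(qs)                      # evaluate now, as a prefetch would
        self._prefetched = qs

    def all(self):
        # Return the prefetched queryset if there is one.
        if self._prefetched is not None:
            return self._prefetched
        return ToyQuerySet(self._table)

    def add(self, row):
        self._table.append(row)       # writes, but leaves the cache alone


manager = ToyRelatedManager(["a", "b"])
manager.prefetch()
manager.add("c")

cached = list(manager.all())          # served from the prefetched results
forced = list(manager.all().all())    # fresh queryset, so it re-queries

print(cached)  # ['a', 'b']
print(forced)  # ['a', 'b', 'c']
```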

On 7 June 2016 at 13:26, Yoong Kang Lim wrote:

> Hi all, I'd like to bring up ticket #26706:
> https://code.djangoproject.com/ticket/26706
>
> Related managers have methods such as add(), change() and remove() that
> change database objects. When a prefetch_related is done prior to calling
> these methods, it does not clear the cache. When the related field is
> accessed, it returns the cached result instead of the updated result. A
> couple of tickets have been opened, as this does seem to be surprising
> behaviour.
>
> I was working on a patch to address this, but Tim brought up some concerns
> about backward compatibility regarding the change and directed me here to
> get some community consensus. The change I'm proposing will clear the cache
> (for the prefetched field) when any of the methods are called. If we
> introduce this, it will be a backwards-incompatible change, so I'd just
> like to get some opinions on what the best way forward would be. Obviously
> in either case the behaviour should be documented.
>
> Also a thought just occurred to me -- if we don't put this change in,
> could we, as an alternative solution, extend the API to let the user decide
> what to do with the cache? Maybe something like
> clear_prefetched_field(related_field_name) on the manager so that at least
> the user has a choice instead of running the query (although the trouble
> they would need to go through would be similar, IMO).



Re: Django and Tika, API

2016-06-07 Thread Florian Apolloner
Hi,

this mailing list is for the development of Django itself. Please write to 
django-users instead.

Cheers,
Florian 

On Tuesday, June 7, 2016 at 2:25:53 PM UTC+2, Allison A. wrote:
>
> I am trying to extract plain text from PDFs using Apache Tika. I can use 
> a Python binding, python-tika, but I am not sure it's an efficient 
> approach, as some files can be larger than 25M. 
>
> What I need is to send the extracted text, rather than the files 
> themselves, to the server side. The best scenario would be 'extract on the 
> client using Tika and send that plain text to the server/Django'. How would 
> I implement this?
>
> Thanks, 
>
> Ali
>
>
>



Clearing prefetch related on add(), change(), remove()

2016-06-07 Thread Yoong Kang Lim
Hi all, I'd like to bring up ticket 
#26706: https://code.djangoproject.com/ticket/26706

Related managers have methods such as add(), change() and remove() that 
change database objects. When a prefetch_related is done prior to calling 
these methods, it does not clear the cache. When the related field is 
accessed, it returns the cached result instead of the updated result. A 
couple of tickets have been opened, as this does seem to be surprising 
behaviour.

I was working on a patch to address this, but Tim brought up some concerns 
about backward compatibility regarding the change and directed me here to 
get some community consensus. The change I'm proposing will clear the cache 
(for the prefetched field) when any of the methods are called. If we 
introduce this, it will be a backwards-incompatible change, so I'd just 
like to get some opinions on what the best way forward would be. Obviously 
in either case the behaviour should be documented. 

Also a thought just occurred to me -- if we don't put this change in, could 
we, as an alternative solution, extend the API to let the user decide what 
to do with the cache? Maybe something like 
clear_prefetched_field(related_field_name) on the manager so that at least 
the user has a choice instead of running the query (although the trouble 
they would need to go through would be similar, IMO).
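
A toy sketch in plain Python may help pin down the two options. This is not
Django's implementation: `ToyRelatedManager`, `clear_cache_on_write` and
`clear_prefetched` are made-up names, with `clear_prefetched` standing in for
the `clear_prefetched_field(related_field_name)` idea above.

```python
# Toy model only: hypothetical names, not Django's implementation.

class ToyRelatedManager:
    """A related manager whose all() prefers a prefetched cache."""

    def __init__(self, rows, clear_cache_on_write=False):
        self._rows = list(rows)      # stands in for the database rows
        self._prefetched = None      # stands in for the prefetch cache
        self._clear_cache_on_write = clear_cache_on_write

    def prefetch(self):
        # Like prefetch_related: load the rows once and remember them.
        self._prefetched = list(self._rows)

    def all(self):
        # Serve the cache if present, otherwise "query the database".
        if self._prefetched is not None:
            return list(self._prefetched)
        return list(self._rows)

    def clear_prefetched(self):
        # The explicit escape hatch floated as an alternative.
        self._prefetched = None

    def add(self, row):
        self._rows.append(row)
        if self._clear_cache_on_write:
            self.clear_prefetched()

    def remove(self, row):
        self._rows.remove(row)
        if self._clear_cache_on_write:
            self.clear_prefetched()


# Current behaviour: the write goes through, but reads stay stale.
current = ToyRelatedManager(["a", "b"])
current.prefetch()
current.add("c")
stale = current.all()

# Proposed behaviour: the write also drops the prefetched rows.
proposed = ToyRelatedManager(["a", "b"], clear_cache_on_write=True)
proposed.prefetch()
proposed.add("c")
proposed.remove("a")
fresh = proposed.all()

print(stale)  # ['a', 'b']
print(fresh)  # ['b', 'c']
```

With the alternative API, code using `current` would have to call
`current.clear_prefetched()` itself before reading, which is the extra
trouble mentioned above.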
