Rohit Sarkar <rohitsarkar5...@gmail.com> writes:

> Hey,
> On Wed, Jun 17, 2020 at 07:55:37AM +1000, Daniel Axtens wrote:
>> Rohit Sarkar <rohitsarkar5...@gmail.com> writes:
>> 
>> > Hey,
>> > On Fri, Jun 12, 2020 at 12:04:54AM +0530, Rohit Sarkar wrote:
>> >> The parserelations command will be used to bulk import patch relations
>> >> into Patchwork using a "patch groups" file.
>> >> The patch groups file is structured such that each line contains a
>> >> relation.
>> >> Eg patch groups file contents.
>> >> 1 3 4
>> >> 2
>> >> 5 9 10
>> >> 
>> >> In this case 2 relations will be ingested into Patchwork, (1,3,4) and
>> >> (5,9,10). Further group (5,9,10) also points to two upstream commit
>> >> references.
>> >> Note that before ingesting the relations the existing relations in the
>> >> DB are removed.
>> >> 
>> >> v1..v2
>> >> - remove commit references from the patch groups file.
>> >> - Update documentation
>> >> - rename some variables to remove the overloading.
>> >> - use filter and update instead of get and save to reduce the db calls.
>> >>   (Visible performance enhancement)
>> >> 
>> >> Leaving the copyright untouched till Ralf and Lukas comment on how to
>> >> proceed.
>> >> 
>> >> Rohit Sarkar (1):
>> >>   management: introduce parserelations command
>> >> 
>> >>  docs/deployment/management.rst                | 26 +++++++
>> >>  .../management/commands/parserelations.py     | 71 +++++++++++++++++++
>> >>  patchwork/tests/test_management.py            |  7 ++
>> >>  3 files changed, 104 insertions(+)
>> >>  create mode 100644 patchwork/management/commands/parserelations.py
>> >> 
>> >> -- 
>> >> 2.23.0.385.gbc12974a89
>> >> 
>> >
>> > Just wanted to follow up on this. Does this look good?
>> 
>> I was thinking about this as I washed the dishes last night.
>> 
>> Purging all relations in the database means that if any API-based user
>> of relations emerges before the big public servers adopt a release with
>> this, then they'll never be able to use it.
>> 
>> What's the speed impact of doing two passes through the data, with pass
>> 1 collecting all the projects that this touches and pass 2 actually
>> installing the relations?
>> 
>> Regards,
>> Daniel
> I wanted to reopen the discussion on this with the aim being to get
> everyone on the same page, and either go with this approach or decide on
> an alternate approach.
>
> The primary goal is to have a "parserelations" command that can parse a
> "patch-groups" file that contains a relation on each line. There are 2
> approaches:
>
> a. Refresh the relations table. The parserelations command will remove
> the existing relations before inserting the newly parsed relations. This
> avoids any conflicts between existing and new relations. However other
> users of the API might be affected.
>
> b. Insert relations in a non-destructive way. We need a way to handle
> conflicts between existing and incoming relations. So the solution is
> not really clear in this case. What should have precedence in the case
> of conflicts? The parserelations script or the existing relations?
>


Right, sorry for dropping this, work got a bit crazy.

It occurs to me that I have an embedded assumption that may not be true
here. Are you trying to build a set of relations that includes
cross-project relations, or are you just trying to build a set of
relations entirely or mostly within a single patchwork project?

I've been assuming that your relations are entirely or predominantly
intra-project, but I'm starting to suspect that maybe my understanding
is not right.

If you are expecting a non-trivial quantity of cross-project relations,
then my suggested approach earlier (caller of the script specifies the
projects that patches may belong to) doesn't make sense. It probably
makes most sense to go with approach (a) in that case - it's really
difficult to figure out how to do (b) in a way that preserves the
meaning that any other API user, or the PaStA tool, is trying to create.

If we do go down route (a), I want to make it really hard for users to
accidentally shoot themselves in the foot here and wipe out data. So
probably I would say that parserelations should only insert relations if
the table is empty when the script begins, and there should be a
separate command with a suitably scary name that purges all
existing relations.
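
To make that concrete, here is a rough sketch of the behaviour I have in
mind. It is plain Python with an in-memory list standing in for the
relations table, so every name in it (import_relations,
purge_all_relations_irreversibly, RelationsNotEmpty) is hypothetical and
not part of Patchwork or of the posted patch. I'm also assuming that a
line naming a single patch yields no relation, as in the "1 3 4 / 2 /
5 9 10" example from the cover letter.

```python
# Hypothetical sketch: a plain list stands in for the relations table.
# None of these names are Patchwork's actual API.


class RelationsNotEmpty(Exception):
    """Raised when relations already exist, so the import must not proceed."""


def parse_patch_groups(lines):
    """Turn patch-groups lines into lists of patch IDs, one per relation.

    Lines naming fewer than two patches are skipped (assumption based on
    the cover letter example, where the line "2" yields no relation).
    """
    groups = []
    for line in lines:
        ids = [int(token) for token in line.split()]
        if len(ids) >= 2:
            groups.append(ids)
    return groups


def import_relations(table, lines):
    """Insert parsed relations only if the table is empty at the start."""
    if table:
        raise RelationsNotEmpty(
            "relations already exist; purge them explicitly first")
    table.extend(parse_patch_groups(lines))
    return len(table)


def purge_all_relations_irreversibly(table):
    """Separate, deliberately scary-sounding step that wipes every relation."""
    removed = len(table)
    table.clear()
    return removed
```

The point of the split is that a second run of the import fails loudly
instead of silently replacing whatever is already there; wiping the data
requires someone to invoke the purge step by name.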


Have I correctly understood things here?

Regards,
Daniel




>
> Thanks,
> Rohit
>> > Thanks,
>> > Rohit
_______________________________________________
Patchwork mailing list
Patchwork@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/patchwork
