True, in the openSUSE repository there are two possibilities, 'src' and 'nosrc' (the latter should be a legacy variant without source code); both are recognized by createrepo_c as arch 'src'.
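
To make the collision concrete, here is a minimal sketch in plain Python. It is not Pulp code: the Pkg record and the repo_key / has_duplicates helpers are hypothetical names I made up for illustration, and only the field values are taken from the two glibc entries in the primary.xml quoted in this thread. It compares three candidate uniqueness keys: NEVRA alone, NEVRA plus pkgId (checksum), and NEVRA plus location_href.

```python
from collections import namedtuple

# Hypothetical record type for illustration; not Pulp's actual Package model.
Pkg = namedtuple("Pkg", "name epoch version release arch pkgId location_href")

# Values copied from the two glibc entries in the quoted primary.xml.
nosrc = Pkg("glibc", "0", "2.19", "20.3", "src",
            "00d36c0f741b0c01a77ce318a2bbcfa59cb4dd0b24ce61f57c6205e4fa1bb310",
            "nosrc/glibc-2.19-20.3.nosrc.rpm")
src = Pkg("glibc", "0", "2.19", "20.3", "src",
          "353e1dc85eab8d434be83160eca4fcee11a72eec345385df125ca0835abd6068",
          "src/glibc-2.19-20.3.src.rpm")

NEVRA = ("name", "epoch", "version", "release", "arch")


def repo_key(pkg, fields):
    """Build a uniqueness key from the given field names (hypothetical helper)."""
    return tuple(getattr(pkg, f) for f in fields)


def has_duplicates(pkgs, fields):
    """Return True if at least two packages share the same key."""
    keys = [repo_key(p, fields) for p in pkgs]
    return len(set(keys)) < len(keys)


print(has_duplicates([nosrc, src], NEVRA))                       # True
print(has_duplicates([nosrc, src], NEVRA + ("pkgId",)))          # False
print(has_duplicates([nosrc, src], NEVRA + ("location_href",)))  # False
```

Because both entries report arch 'src', the NEVRA-only key cannot tell them apart, while either extra field separates them. This only shows the key comparison; it says nothing about which key is the right choice for Pulp.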
As for the Pulp 2 code I mentioned, I found it here [0] (the base RPM package model, as far as I understand). In Pulp 3 the error is raised here [1], in pulpcore, when packages are added to a repository version. So, as Ina mentioned, the issue doesn't have to be with the packages themselves but rather with the logic in sync.

[0] https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/db/models.py#L779
[1] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L570

On Wed, Mar 18, 2020 at 1:55 PM Ina Panova <ipan...@redhat.com> wrote:

> Tanya and Pavel,
> in this issue it is explained why we cannot keep 2 packages with same
> NEVRA but different checksums within a repo
> https://pulp.plan.io/issues/494
>
> Pulp2 had a limitation where it was not able to save on the filesystem 2
> rpms with same filename, it led to the primary.xml that could have pointed
> to the rpm that did not actually get saved.
> I believe in Pulp3 we could allow having rpm with same NEVRA if they have
> different location_href within a repo.
>
> --------
> Regards,
>
> Ina Panova
> Senior Software Engineer| Pulp| Red Hat Inc.
>
> "Do not go where the path may lead,
> go instead where there is no path and leave a trail."
>
>
> On Wed, Mar 18, 2020 at 10:47 AM Tatiana Tereshchenko <ttere...@redhat.com>
> wrote:
>
>> Hi Pavel,
>>
>> On Tue, Mar 17, 2020 at 7:31 PM Pavel Picka <ppi...@redhat.com> wrote:
>>
>>> Hello, would like to ask you how to proceed with issue with duplicate
>>> (but not really) packages.
>>>
>>> I am syncing suse repository (opensuse42 and SLE12) and get a
>>> duplicate error. But when checking the packages [0](from primary.xml) glibc
>>> and glibc they got same nevra but different checksum (and a few more as
>>> size..) so doesn't look like real duplicates.
>>>
>> Those are weird, they have the same nevra but see the location_href, one
>> is src and the other one is nosrc! :/ :
>> <location href="nosrc/glibc-2.19-20.3.nosrc.rpm"/>
>> <location href="src/glibc-2.19-20.3.src.rpm"/>
>>
>> It looks like something OpenSUSE specific. I'm not sure if it's a valid
>> way to create a repo with such metadata, we need to figure it out at some
>> point.
>>
>>
>>> I've checked Pulp2 and there is used nevra+sum for repository
>>> uniqueness. In pulp3 we use only nevra.
>>>
>> Why do you think that in pulp 2 we use NEVRA + checksum? have you tested
>> it? please point to the code.
>> I believe in Pulp 2 as well as in Pulp 3 we allow to have packages with
>> different checksums in Pulp storage.
>> I don't think we allow having the same packages with different checksums
>> in the same repo.
>> FWIW, in pulp 2 the most recently added package is chosen to stay in a
>> repo, no packages with duplicate NEVRA left after sync, see
>> https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/importers/yum/purge.py#L285-L333
>>
>>
>>>
>>> My suggestion is to extend repo_key_fields for rpm package as is in
>>> pulp2 with pkgId (checksum). As I don't think they are really duplicates
>>> and other software can rely on specific version of package.
>>>
>>
>> Unfortunately, I don't remember the main reason to remove duplicates
>> based on nevra. Was it because some tooling will complain, or was it just
>> to avoid duplicates at resync time? Does anyone know?
>> We should not change it unless we know for sure that it's needed + we
>> would need to have an agreement from all our stakeholders for that change.
>>
>> For now, I think we can move on and ensure that no duplicates are in a
>> repo version. To my understanding, the behaviour will be the same as in
>> pulp 2.
>> Feel free to share where you get duplicate error to see if it's a bug or
>> not. I wonder why duplicates are not removed automatically. Maybe because
>> the first version contains duplicates due to this bug
>> https://pulp.plan.io/issues/6217 ?
>>
>> Tanya
>>
>>
>>>
>>> What do you think?
>>>
>>>
>>> [0]
>>>
>>>> <package type="rpm">
>>>> <name>glibc</name>
>>>> <arch>src</arch>
>>>> <version epoch="0" ver="2.19" rel="20.3"/>
>>>> <checksum type="sha256" pkgid="YES">00d36c0f741b0c01a77ce318a2bbcfa59cb4dd0b24ce61f57c6205e4fa1bb310</checksum>
>>>> <summary>Standard Shared Libraries (from the GNU C Library)</summary>
>>>> <description>The GNU C Library provides the most important standard
>>>> libraries used
>>>> by nearly all programs: the standard C library, the standard math
>>>> library, and the POSIX thread library. A system is not functional
>>>> without these libraries.</description>
>>>> <packager>https://www.suse.com/</packager>
>>>> <url>http://www.gnu.org/software/libc/libc.html</url>
>>>> <time file="1426696882" build="1425645307"/>
>>>> <size package="591662" installed="13047428" archive="974464"/>
>>>> <location href="nosrc/glibc-2.19-20.3.nosrc.rpm"/>
>>>> <format>
>>>> <rpm:license>LGPL-2.1+ and SUSE-LGPL-2.1+-with-GCC-exception and GPL-2.0+</rpm:license>
>>>> <rpm:vendor>SUSE LLC <https://www.suse.com/></rpm:vendor>
>>>> <rpm:group>System/Libraries</rpm:group>
>>>> <rpm:buildhost>sheep16</rpm:buildhost>
>>>> <rpm:sourcerpm/>
>>>> <rpm:header-range start="872" end="144403"/>
>>>> <rpm:requires>
>>>> <rpm:entry name="pwdutils"/>
>>>> <rpm:entry name="xz"/>
>>>> <rpm:entry name="fdupes"/>
>>>> <rpm:entry name="systemd-rpm-macros"/>
>>>> <rpm:entry name="libselinux-devel"/>
>>>> <rpm:entry name="makeinfo"/>
>>>> </rpm:requires>
>>>> </format>
>>>> </package>
>>>>
>>>> <package type="rpm">
>>>> <name>glibc</name>
>>>> <arch>src</arch>
>>>> <version epoch="0" ver="2.19" rel="20.3"/>
>>>> <checksum type="sha256" pkgid="YES">353e1dc85eab8d434be83160eca4fcee11a72eec345385df125ca0835abd6068</checksum>
>>>> <summary>Standard Shared Libraries (from the GNU C Library)</summary>
>>>> <description>The GNU C Library provides the most important standard
>>>> libraries used
>>>> by nearly all programs: the standard C library, the standard math
>>>> library, and the POSIX thread library. A system is not functional
>>>> without these libraries.</description>
>>>> <packager>https://www.suse.com/</packager>
>>>> <url>http://www.gnu.org/software/libc/libc.html</url>
>>>> <time file="1426696883" build="1423750734"/>
>>>> <size package="12678975" installed="13047285" archive="13057760"/>
>>>> <location href="src/glibc-2.19-20.3.src.rpm"/>
>>>> <format>
>>>> <rpm:license>LGPL-2.1+ and SUSE-LGPL-2.1+-with-GCC-exception and GPL-2.0+</rpm:license>
>>>> <rpm:vendor>SUSE LLC <https://www.suse.com/></rpm:vendor>
>>>> <rpm:group>System/Libraries</rpm:group>
>>>> <rpm:buildhost>sheep02</rpm:buildhost>
>>>> <rpm:sourcerpm/>
>>>> <rpm:header-range start="872" end="144334"/>
>>>> <rpm:requires>
>>>> <rpm:entry name="pwdutils"/>
>>>> <rpm:entry name="xz"/>
>>>> <rpm:entry name="fdupes"/>
>>>> <rpm:entry name="systemd-rpm-macros"/>
>>>> <rpm:entry name="libselinux-devel"/>
>>>> <rpm:entry name="makeinfo"/>
>>>> </rpm:requires>
>>>> </format>
>>>> </package>
>>>
>>>
>>> --
>>> Pavel Picka
>>> Red Hat
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> Pulp-dev@redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>
>>
>

--
Pavel Picka
Red Hat