Re: Dealing with renamed source packages during CVE triaging
Antoine Beaupré writes:

> bam: do you want me to start working on that script or were you working
> on this already?

See
https://salsa.debian.org/security-tracker-team/security-tracker/merge_requests/8

I personally find this easier to understand as we use the existing CVE
list parser, although I have not considered how to write changes (as
this wasn't a requirement when I wrote this).
--
Brian May
Re: Dealing with renamed source packages during CVE triaging
On 2018-06-15 10:27:45, Moritz Muehlenhoff wrote:
> On Fri, Jun 15, 2018 at 04:34:14PM +1000, Brian May wrote:
>> Moritz Muehlenhoff writes:
>>
>> > On Wed, Jun 13, 2018 at 05:19:40PM +1000, Brian May wrote:

[...]

>> That generates a report of all packages that we need to check. I assume
>> we would need some way of marking packages that we have checked and
>> found to be not affected, so we can get a list of packages that need
>> immediate attention and don't repeatedly check the same package multiple
>> times. How should we do this? Maybe another file in the security tracker
>> repository?
>
> Maybe start with the script initially and see whether it's useful as an
> approach in general. State tracking can be discussed/added later.

Maybe the same principle applies as with the approach I considered. We
could have a --stop argument that would consider entries up to a certain
CVE number and ignore the rest of the file.

> Lots of the false positives will result from crappy/outdated entries
> in embedded-code-copies, so fixing those up will drastically reduce
> false positives.

If the embedded-code-copies is used more systematically, with a
semi-automated script, in the triaging process, we'll be more inclined
to keep it up to date as well so I think it would actually help with
that as well...

bam: do you want me to start working on that script or were you working
on this already?

Thanks for the feedback,

A.
--
Ils versent un pauvre miel sur leurs mots pourris et te parlent de pénurie
Et sur ta faim, sur tes amis, ils aiguisent leur appétit
                        - Richard Desjardins, La maison est ouverte
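The --stop idea in the message above can be sketched roughly as follows. This is only an illustration, not code from the tracker: the function names are made up, and it assumes each entry begins with a header line starting with a numeric CVE identifier, with everything else (indented annotations, temporary identifiers) skipped.

```python
import argparse
import re

# Assumption: each entry starts with a header line such as "CVE-2018-1234 ...";
# indented annotation lines and temporary identifiers are skipped here.
CVE_HEADER = re.compile(r"^(CVE-\d{4}-\d+)")


def entries_until(lines, stop=None):
    """Yield CVE identifiers in file order, ending with `stop` (inclusive)."""
    for line in lines:
        match = CVE_HEADER.match(line)
        if match is None:
            continue  # not an entry header
        yield match.group(1)
        if match.group(1) == stop:
            break  # ignore the rest of the file


def build_parser():
    """Hypothetical command line: a list file plus an optional --stop."""
    parser = argparse.ArgumentParser(description="list CVE entries up to --stop")
    parser.add_argument("listfile", help="path to a CVE list file")
    parser.add_argument("--stop", help="last CVE identifier to consider")
    return parser
```

A caller would open the file named by `listfile` and pass the file object to `entries_until` together with `args.stop`; without `--stop` the whole file is considered.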
Re: Dealing with renamed source packages during CVE triaging
Brian May writes:

> I will look at making a pull request tomorrow. The changes should be
> reasonably straightforward syntax changes (e.g. use "!=" instead of
> "<>" for the not-equal operator), work with Python3 in stretch, and
> not require any additional dependencies (I think it only depends on
> Python3).

Python3 support:
https://salsa.debian.org/security-tracker-team/security-tracker/merge_requests/7

This one implements bin/list-potential-packages-affected-by-code-copies:
https://salsa.debian.org/security-tracker-team/security-tracker/merge_requests/8

At present I have written this one to work with Python 2.7 and
Python 3.6, but it won't work with Python 3.6 without the other pull
request being merged first.
--
Brian May
Re: Dealing with renamed source packages during CVE triaging
Salvatore Bonaccorso writes:

>> Feel free to make a pull request, I don't think we have a specific
>> dependency on Python 2 modules anywhere. But it might take a bit to get
>> reviewed/deployed as it's not a high priority issue.
>
> To be kept in mind: whatever change is proposed for the code part of
> the security tracker needs potentially to be able to run on the
> security-tracker host soriano (running on stretch), preferably without
> introducing new dependencies if they are not needed. Merge/pull requests
> for those parts are preferred.

I will look at making a pull request tomorrow. The changes should be
reasonably straightforward syntax changes (e.g. use "!=" instead of
"<>" for the not-equal operator), work with Python3 in stretch, and
not require any additional dependencies (I think it only depends on
Python3).

Perhaps the most intrusive change is deleting the .py file that defines
namedtuple; it is no longer needed now that Python's collections module
has a built-in namedtuple.
--
Brian May
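For readers unfamiliar with the two changes mentioned above, here is a small sketch of what they look like in practice. The `Annotation` type and its fields are hypothetical and are not taken from lib/python/sectracker; only `collections.namedtuple` and the `!=` operator are the actual points being illustrated.

```python
from collections import namedtuple

# Hypothetical record type, for illustration only; the standard library
# provides namedtuple, so a private reimplementation is unnecessary.
Annotation = namedtuple("Annotation", ["package", "kind", "version"])


def differs(first, second):
    """Compare two values with "!=", the only not-equal spelling in Python 3."""
    # The legacy "<>" operator is a syntax error under Python 3.
    return first != second


fixed = Annotation(package="tiff", kind="fixed", version="4.0.9-1")
other = fixed._replace(version="4.0.9-2")
```

Both constructs also work under Python 2.7, which matters if a script has to run on both interpreters for a while.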
Re: Dealing with renamed source packages during CVE triaging
Hi,

On Fri, Jun 15, 2018 at 10:23:15AM +0200, Moritz Muehlenhoff wrote:
> On Fri, Jun 15, 2018 at 05:21:55PM +1000, Brian May wrote:
> > Brian May writes:
> >
> > > So we could write a script, let's say:
> > > bin/list-potential-packages-affected-by-code-copies
> >
> > In investigating the possibility of this, I noticed the scripts in
> > lib/python/sectracker use legacy python coding standards.
> >
> > I have updated these files on my local box to work with Python 3, but
> > refraining from pushing for now, because of the possibility I might break
> > something important.
>
> When the Debian Security Tracker was created, Python 3 didn't even exist
> yet :-)
>
> Feel free to make a pull request, I don't think we have a specific dependency
> on Python 2 modules anywhere. But it might take a bit to get reviewed/deployed
> as it's not a high priority issue.

To be kept in mind: whatever change is proposed for the code part of
the security tracker needs potentially to be able to run on the
security-tracker host soriano (running on stretch), preferably without
introducing new dependencies if they are not needed. Merge/pull requests
for those parts are preferred.

Regards,
Salvatore
Re: Dealing with renamed source packages during CVE triaging
On Fri, Jun 15, 2018 at 04:34:14PM +1000, Brian May wrote:
> Moritz Muehlenhoff writes:
>
> > On Wed, Jun 13, 2018 at 05:19:40PM +1000, Brian May wrote:
> >> "as I said in the mailing list discussion, I don't like the usage of the
> >> undetermined tag... we use it to hide stuff we can't investigate under
> >> the carpet, I would much prefer that we put it as directly
> >> when it's the case, or otherwise."
> >
> > Of course, those can be resolved; it just needs someone to do the analysis
> > work.
> > Switching to some other tags (and incorrect ones!) doesn't change anything.
>
> Seems like this is a moot point anyway, as from the comments you left in
> the pull request, you don't like this approach of automatically adding
> entries in data/CVE/list. Fair enough.
>
> So we could write a script, let's say:
> bin/list-potential-packages-affected-by-code-copies

You're mixing two things; my comment above refers to <undetermined>, those are
one-off investigations and don't need any particular tooling.

> That generates a report of all packages that we need to check. I assume
> we would need some way of marking packages that we have checked and
> found to be not affected, so we can get a list of packages that need
> immediate attention and don't repeatedly check the same package multiple
> times. How should we do this? Maybe another file in the security tracker
> repository?

Maybe start with the script initially and see whether it's useful as an
approach in general. State tracking can be discussed/added later.

Lots of the false positives will result from crappy/outdated entries
in embedded-code-copies, so fixing those up will drastically reduce
false positives.

Cheers,
        Moritz
Re: Dealing with renamed source packages during CVE triaging
On Fri, Jun 15, 2018 at 05:21:55PM +1000, Brian May wrote:
> Brian May writes:
>
> > So we could write a script, let's say:
> > bin/list-potential-packages-affected-by-code-copies
>
> In investigating the possibility of this, I noticed the scripts in
> lib/python/sectracker use legacy python coding standards.
>
> I have updated these files on my local box to work with Python 3, but
> refraining from pushing for now, because of the possibility I might break
> something important.

When the Debian Security Tracker was created, Python 3 didn't even exist
yet :-)

Feel free to make a pull request, I don't think we have a specific dependency
on Python 2 modules anywhere. But it might take a bit to get reviewed/deployed
as it's not a high priority issue.

Cheers,
        Moritz
Re: Dealing with renamed source packages during CVE triaging
Brian May writes:

> So we could write a script, let's say:
> bin/list-potential-packages-affected-by-code-copies

In investigating the possibility of this, I noticed the scripts in
lib/python/sectracker use legacy Python coding standards.

I have updated these files on my local box to work with Python 3, but
am refraining from pushing for now, because of the possibility I might
break something important.

Is Python 2 compatibility still required?
--
Brian May
Re: Dealing with renamed source packages during CVE triaging
Moritz Muehlenhoff writes:

> On Wed, Jun 13, 2018 at 05:19:40PM +1000, Brian May wrote:
>> "as I said in the mailing list discussion, I don't like the usage of the
>> undetermined tag... we use it to hide stuff we can't investigate under
>> the carpet, I would much prefer that we put it as directly
>> when it's the case, or otherwise."
>
> Of course, those can be resolved; it just needs someone to do the analysis
> work.
> Switching to some other tags (and incorrect ones!) doesn't change anything.

Seems like this is a moot point anyway, as from the comments you left in
the pull request, you don't like this approach of automatically adding
entries in data/CVE/list. Fair enough.

So we could write a script, let's say:
bin/list-potential-packages-affected-by-code-copies

That generates a report of all packages that we need to check. I assume
we would need some way of marking packages that we have checked and
found to be not affected, so we can get a list of packages that need
immediate attention and don't repeatedly check the same package multiple
times. How should we do this? Maybe another file in the security tracker
repository?

Would anybody object to this approach?
--
Brian May
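The report and the "already checked" marker described above might be modelled along these lines. This is a sketch under assumptions, not the script from the merge request: the function name, the dictionary shapes, and the idea of storing checked (CVE, package) pairs are all hypothetical.

```python
def packages_to_check(cve_packages, code_copies, checked=frozenset()):
    """Return sorted (cve, package) pairs that still need a manual look.

    cve_packages: maps a CVE id to the set of source packages already
        listed for it (hypothetical shape).
    code_copies: maps a source package to the packages that embed a copy
        of it (hypothetical shape, e.g. derived from embedded-code-copies).
    checked: (cve, package) pairs already reviewed and found not affected.
    """
    todo = set()
    for cve, packages in cve_packages.items():
        for package in packages:
            for copy in code_copies.get(package, ()):
                if copy in packages:
                    continue  # already tracked for this CVE
                if (cve, copy) in checked:
                    continue  # reviewed earlier, do not report again
                todo.add((cve, copy))
    return sorted(todo)
```

Keeping the checked pairs in a separate file in the repository would let the report shrink as packages are reviewed, which is the state tracking the message asks about.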
Re: Dealing with renamed source packages during CVE triaging
On Wed, Jun 13, 2018 at 05:19:40PM +1000, Brian May wrote:
> "as I said in the mailing list discussion, I don't like the usage of the
> undetermined tag... we use it to hide stuff we can't investigate under
> the carpet, I would much prefer that we put it as directly
> when it's the case, or otherwise."

Of course, those can be resolved; it just needs someone to do the analysis
work.
Switching to some other tags (and incorrect ones!) doesn't change anything.

Cheers,
        Moritz
Re: Dealing with renamed source packages during CVE triaging
Antoine Beaupré writes:

> https://salsa.debian.org/security-tracker-team/security-tracker/merge_requests/4
>
> Comments are welcome there or here.

Current comments on merge request, copied and pasted here, as I think
relevant for the discussion here:

Moritz Muehlenhoff (@jmm, Owner) commented 4 days ago:

    Strong nack, the data quality of embedded code copies isn't useful for
    this. When you've verified a certain package to be affected, add it
    manually (with references), but don't dump lots of unactionable data
    into the tracker.

Brian May (@bam, Developer) commented 2 minutes ago:

    @jmm The problem I believe is how do we keep track of packages that
    might be affected but aren't listed in the security tracker? Do we
    maybe need to keep track of this information outside the security
    tracker?
--
Brian May
Re: Dealing with renamed source packages during CVE triaging
Brian May writes: > In any case, possibly better to leave feedback on the pull request: s/pull request/issue/ Sorry for any confusion. -- Brian May
Re: Dealing with renamed source packages during CVE triaging
Moritz Muehlenhoff writes:

> On Tue, Jun 12, 2018 at 05:40:34PM +1000, Brian May wrote:
>> 1. Tagging with / instead of .
>
> None of those can be automated. The basic point of <undetermined> is that
> we lack data to make a proper assessment.
>
> The correct way to handle these is to triage
> https://security-tracker.debian.org/tracker/status/undetermined by contacting
> e.g. upstream developers or the reporters of the vulnerability and then amend
> CVE/list with the necessary information, i.e. either converting them to
> if it has been confirmed to be an issue or to
> .

From an email sent to a Freexian list:

"as I said in the mailing list discussion, I don't like the usage of the
undetermined tag... we use it to hide stuff we can't investigate under
the carpet, I would much prefer that we put it as directly
when it's the case, or otherwise."

Having said that, I am not sure I personally understand this concern. It
would simplify things if we could just use .

>> 3. Resolve general issue regarding CVE/list, and if it should be split up.
>
> That has been proposed and nacked several times before. There's simply
> no practical reason for it. It would add multiple complications (starting
> with the MITRE sync, syncing with external parties, changes to the tracker)
> for no measurable gain. Quite the contrary; it's extremely useful to have
> 20 years of vulnerability data easily available in a single emacs buffer.

The concerns (from reading the PR) were that:

* git can't cope efficiently with such large files.
* emacs can't cope efficiently with such large files.

In any case, possibly better to leave feedback on the pull request:

https://salsa.debian.org/security-tracker-team/security-tracker/issues/2
--
Brian May
Re: Dealing with renamed source packages during CVE triaging
On Tue, Jun 12, 2018 at 05:40:34PM +1000, Brian May wrote:
> 1. Tagging with / instead of .

None of those can be automated. The basic point of <undetermined> is that
we lack data to make a proper assessment.

The correct way to handle these is to triage
https://security-tracker.debian.org/tracker/status/undetermined by contacting
e.g. upstream developers or the reporters of the vulnerability and then amend
CVE/list with the necessary information, i.e. either converting them to
if it has been confirmed to be an issue or to
.

> 3. Resolve general issue regarding CVE/list, and if it should be split up.

That has been proposed and nacked several times before. There's simply
no practical reason for it. It would add multiple complications (starting
with the MITRE sync, syncing with external parties, changes to the tracker)
for no measurable gain. Quite the contrary; it's extremely useful to have
20 years of vulnerability data easily available in a single emacs buffer.

Cheers,
        Moritz
Re: Dealing with renamed source packages during CVE triaging
Antoine Beaupré writes:

> I've finalized a prototype during my research on this problem, which I
> have detailed on GitLab, as it's really code that should be merged. It
> would also benefit from wider attention considering it affects more than
> LTS now. Anyways, the MR is here:
>
> https://salsa.debian.org/security-tracker-team/security-tracker/merge_requests/4
>
> Comments are welcome there or here.
>
> For what it's worth, I reused Lamby's crude parser because I wanted to
> get the prototype out the door. I am also uncertain that a full parser
> can create the CVE/list file as is reliably without introducing
> inconsistent diffs...
>
> I also drifted into the core datastructures of the security tracker, and
> wondered if it would be better to split up our large CVE/list file now
> that we're using git. I had mixed results. For those interested, it is
> documented here:
>
> https://salsa.debian.org/security-tracker-team/security-tracker/issues/2

So if I understand correctly, the parts that aren't done yet are:

1. Tagging with / instead of .

2. Not processing old entries that we don't care about anymore.

3. Resolve general issue regarding CVE/list, and if it should be split up.

For these:

1. We need to be able to tell whether the package still exists in a
given distribution. This information is not available from the
security-tracker database; we would need to get it using online JSON
calls, for each and every package we look at. That is likely to be very
slow, although incremental processing might help.

2. For incremental updates, coming up with a definition of old entries
that is easy to check seems to be the stumbling point here, particularly
as entries in CVE/list can be created out of order, and old CVEs might
still be very relevant. Maybe we need to create/update a list of all
CVEs we have processed before? Would this work, or is there some problem
I haven't thought of?

Ideally, for this to work properly, we would also need to ensure that it
updates all entries in one run, as one run would be all we get, rather
than multiple runs as can be the case now.

3. I have not noticed git operations being slow, but then again I don't
often update this file. As a potential compromise, maybe instead of one
file per CVE, one file per year?
--
Brian May
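The "list of all CVEs we have processed before" idea in point 2 could be kept in a plain text file, one identifier per line. The sketch below is only one possible reading of that suggestion; the file layout and all function names are assumptions, not anything that exists in the tracker.

```python
import os


def load_processed(path):
    """Read previously processed CVE ids, one per line; empty set if absent."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def select_new(cve_ids, processed):
    """Keep the ids not handled in an earlier run, preserving input order."""
    return [cve for cve in cve_ids if cve not in processed]


def save_processed(path, processed):
    """Write the ids sorted, so the state file produces stable diffs in git."""
    with open(path, "w") as handle:
        for cve in sorted(processed):
            handle.write(cve + "\n")
```

Because membership is tested per identifier rather than by position in the file, this does not depend on entries being added in order, which is the difficulty raised above. It does mean each entry gets exactly one automated pass, matching the "one run would be all we get" caveat.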
Re: Dealing with renamed source packages during CVE triaging
I've finalized a prototype during my research on this problem, which I
have detailed on GitLab, as it's really code that should be merged. It
would also benefit from wider attention considering it affects more than
LTS now. Anyways, the MR is here:

https://salsa.debian.org/security-tracker-team/security-tracker/merge_requests/4

Comments are welcome there or here.

For what it's worth, I reused Lamby's crude parser because I wanted to
get the prototype out the door. I am also uncertain that a full parser
can create the CVE/list file as is reliably without introducing
inconsistent diffs...

I also drifted into the core datastructures of the security tracker, and
wondered if it would be better to split up our large CVE/list file now
that we're using git. I had mixed results. For those interested, it is
documented here:

https://salsa.debian.org/security-tracker-team/security-tracker/issues/2

Cheers!

a.
--
If it's important for you, you'll find a way.
If it's not, you'll find an excuse.
                        - Unknown
Re: Dealing with renamed source packages during CVE triaging
On 2018-06-08 03:29:38, Brian May wrote:
> Antoine Beaupré writes:
>
>> Right now, it seems that all scripts that hammer at those files do so
>> with their own ad-hoc parsing code. Is that the recommended way of
>> chopping those files up? Or is there a better parsing library out there?
>
> It sounds like we could really do with a good parsing library. Maybe one
> that supports making changes too.
>
> I could make a start on this.

As I mentioned in the other thread, I am uncertain where to go from
here. Some scripts use JSON, others parse the files by hand...

I also found out yesterday after writing this that there is *already* a
parsing library in the security tracker. It can parse {CVE,DSA,DLA}/list
files and lives in lib/python/bugs.py, but it's somewhat coupled with
the sqlite database - I'm not sure it's usable standalone.

But yeah, maybe clarifying all this stuff would help, for sure... I
would recommend not writing yet another library from scratch however, as
we probably have a dozen such parsers already and it's confusing enough
as it is. ;)

a.
--
L'ennui avec la grande famille humaine, c'est que tout le monde veut
en être le père.
                        - Mafalda
Re: Dealing with renamed source packages during CVE triaging
Antoine Beaupré writes:

> Right now, it seems that all scripts that hammer at those files do so
> with their own ad-hoc parsing code. Is that the recommended way of
> chopping those files up? Or is there a better parsing library out there?

It sounds like we could really do with a good parsing library. Maybe one
that supports making changes too.

I could make a start on this.

Obligatory XKCD: https://xkcd.com/927/
--
Brian May
Re: Dealing with renamed source packages during CVE triaging
On Thu, Jun 07, 2018 at 06:07:24PM -0400, Antoine Beaupré wrote:
> Sorry for resurrecting this old thread…

No! I very much appreciate it when people keep issues in the back of
their minds and keep thinking about them and keep reminding us "others"
until they are solved properly!

Thank you. :)

--
cheers,
        Holger (and SCNR too)
Re: Dealing with renamed source packages during CVE triaging
Sorry for resurrecting this old thread, but I've been looking at how to
deal with renamed packages in CVE triaging again.

When we last talked about this, we observed how we were sometimes
missing packages during triage, e.g. `tiff3` that was present in
wheezy. That's not an issue anymore since wheezy is gone, but the
problem occurs more broadly in other packages. In fact, it seems to me
this is similar to the broader problem of embedded code copies.

We could generalize renamed packages to the embedded code copies
problem. We have a database of those in data/embedded-code-copies
already, although I'm not sure how up to date that file actually is, nor
how it is currently used in the workflow.

It seems to me any database of renames we could build would clearly
overlap with the embedded-code-copies file, so I figured I would write a
parser (Python, we already have Perl and bash ones...) to start with. I
have tried to upload this in a fork on salsa but gave up as push (of a
single commit!) was stuck "resolving deltas"... Anyways, here's the
snippet:

https://salsa.debian.org/anarcat/security-tracker/snippets/70

The next step is to figure out how to actually modify the data/CVE/list
file to introduce the changes. Considering the large number of packages
in the embedded-code-copies file, I am not sure we want to retroactively
change all previous entries. jmm suggested we run a cronjob that would
keep track of where it is in history, which would resolve this nicely.

One question that remains is what, exactly, to add in the CVE
metadata. One problem we faced the last time we looked at this is that
we needed to add an entry like:

    SOURCEPACKAGE ...

which would (e.g.) get triaged to:

    SOURCEPACKAGE [wheezy] SOURCEPACKAGE (or whatever) ...

later on. This requires inside knowledge of the suites and their
packages, something I find surprisingly hard to do in the security
tracker.

With embedded-code-copies, we will have to add something for all the
other source packages, e.g.:

    OTHERSOURCE

Right now, it seems that all scripts that hammer at those files do so
with their own ad-hoc parsing code. Is that the recommended way of
chopping those files up? Or is there a better parsing library out there?

Thanks for any advice,

A.
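To make the "ad-hoc parsing code" mentioned above concrete, here is a deliberately small sketch of the kind of reader such scripts tend to contain. It is not the parser in lib/python/bugs.py nor the linked snippet. It assumes a layout where an entry header starts in the first column and package annotations are indented lines of the form "- package ..." with an optional "[release]" prefix; the sample tags used with it are illustrative.

```python
import re

# Assumed annotation shape: optional "[release]", then "- package", then
# whatever follows (version, tag, bug reference) kept as free text.
ANNOTATION = re.compile(
    r"^\s+(?:\[(?P<release>[a-z]+)\]\s+)?-\s+(?P<package>\S+)(?:\s+(?P<rest>.*))?$"
)


def parse_cve_list(lines):
    """Map each entry id to a list of (release, package, rest) tuples."""
    entries = {}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            current = line.split()[0]  # header line: first word is the id
            entries[current] = []
            continue
        match = ANNOTATION.match(line)
        if match and current is not None:
            entries[current].append(
                (match.group("release"), match.group("package"), match.group("rest") or "")
            )
    return entries
```

A reader like this only extracts data; writing changes back without disturbing the rest of the file is the harder part the thread goes on to discuss, and is not attempted here.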