Re: UDD data - MySQL - Was: Re: Archiving UDD data : FLOSSMole ?
On Tue, Jun 16, 2009 at 11:22:10PM +0200, Olivier Berger wrote:
> I think it depends on the kind of analysis people might want to do for their research in mining the Debian metadata that would be archived in FLOSSMole. Maybe these package versions wouldn't be needed, so strings without the full arithmetic of versions would be enough?
>
> In any case, the first idea is to archive the data so that it's there for history, I suppose. Then the users will complain eventually, and we shall see ;)

Exactly. I see no problem in archiving it to MySQL, injecting versions as strings. That would just mean that you can't run exactly the same queries you run now, but you can still re-inject historical versions elsewhere and/or use some MySQL extensions to obtain the same effect.

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..| . |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...| ..: | Je dis tu à tous ceux que j'aime
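To make the "versions as strings" trade-off concrete: once versions are stored as plain strings, ordering and comparison become lexical, which disagrees with Debian version ordering in common cases. Below is a minimal Python sketch of that difference. It follows the ordering rules as described in Debian Policy (epoch, then upstream version, then revision, with "~" sorting before anything else). It is not the debversion implementation used by UDD, and all function names are hypothetical, chosen only for this illustration. It also assumes well-formed ASCII version strings.

```python
from functools import cmp_to_key

DIGITS = "0123456789"
LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def _order(s, i):
    """Sort weight of the character at s[i]; end of string and digits weigh 0."""
    if i >= len(s) or s[i] in DIGITS:
        return 0
    if s[i] == "~":
        return -1          # tilde sorts before everything, even end of string
    if s[i] in LETTERS:
        return ord(s[i])   # letters sort before other punctuation
    return ord(s[i]) + 256


def _compare_part(a, b):
    """Compare two upstream-version or revision strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit prefixes character by character.
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            wa, wb = _order(a, i), _order(b, j)
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Compare the following digit runs numerically (an empty run counts as 0).
        start_i, start_j = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[start_i:i] or "0"), int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def split_version(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, rest = 0, version
    if ":" in version:
        head, rest = version.split(":", 1)
        epoch = int(head)
    upstream, sep, revision = rest.rpartition("-")
    if not sep:  # no hyphen: the whole string is the upstream version
        upstream, revision = rest, ""
    return epoch, upstream, revision


def compare_versions(a, b):
    """Three-way comparison of two Debian version strings."""
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


versions = ["1.10", "1.9", "1.0~rc1", "1.0", "1:0.5"]
as_strings = sorted(versions)                                     # lexical order
as_versions = sorted(versions, key=cmp_to_key(compare_versions))  # Debian order
print(as_strings)
print(as_versions)
```

As plain strings, "1.10" sorts before "1.9" and "1.0~rc1" sorts after "1.0"; under Debian ordering both pairs are reversed. Any query that filters or sorts on a version range would therefore need this logic re-applied outside the database (or inside it via some extension) once the data lives in a string column.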
Re: UDD data - MySQL - Was: Re: Archiving UDD data : FLOSSMole ?
On Wed, Jun 10, 2009 at 05:59:21PM +0200, Olivier Berger wrote:
> Actually, something obvious seems to make things a bit difficult at first sight: UDD is in Postgres, and FLOSSMole uses MySQL.
>
> We're discussing some options on the ossmole-discuss list to overcome this difficulty. My advice is to import into PG on a Debian system with the required dependencies, create views to convert non-standard types to standard ones, then dump again to SQL, and import into MySQL... any comments?

UDD has some parts that are Postgres-specific. In particular, it uses the Postgres extension mechanism to internalize Debian version comparison so that it can be exploited in queries. In the beginning, the internalization was done only as an extension function. It might be that there is now even a custom data type on which you can use comparison operators and such, but I'm not sure about that (redirect to Lucas).

You just need to pay attention (and maybe do some tests): when migrating away from Postgres, you might lose the ability to reuse the same queries that you can currently use with udd.d.o.

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
Re: UDD data - MySQL - Was: Re: Archiving UDD data : FLOSSMole ?
On Tue, Jun 16, 2009 at 09:49:27AM +0200, Stefano Zacchiroli wrote:
> UDD has some parts that are Postgres-specific. In particular, it uses the Postgres extension mechanism to internalize Debian version comparison so that it can be exploited in queries. In the beginning, the internalization was done only as an extension function. It might be that there is now even a custom data type on which you can use comparison operators and such, but I'm not sure about that (redirect to Lucas).

I can confirm that there is a new data type debversion:

udd=> \d packages
          Table "public.packages"
      Column      |    Type    | Modifiers
------------------+------------+-----------
 package          | text       | not null
 version          | debversion | not null
 architecture     | text       | not null
 maintainer       | text       |
 description      | text       |
 long_description | text       |
 source           | text       |
 source_version   | debversion |
 ...

which is probably hard to reimplement in MySQL. There was a thread here on this list about this data type and its implementation.

> You just need to pay attention (and maybe do some tests): when migrating away from Postgres, you might lose the ability to reuse the same queries that you can currently use with udd.d.o.

IMHO it is not a good idea to try to port UDD to a different database engine.

Kind regards

Andreas.

--
http://fam-tille.de

--
To UNSUBSCRIBE, email to debian-qa-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: UDD data - MySQL - Was: Re: Archiving UDD data : FLOSSMole ?
On Tuesday 16 June 2009 at 12:50 +0200, Andreas Tille wrote:
> On Tue, Jun 16, 2009 at 09:49:27AM +0200, Stefano Zacchiroli wrote:
>> UDD has some parts that are Postgres-specific. In particular, it uses the Postgres extension mechanism to internalize Debian version comparison so that it can be exploited in queries. In the beginning, the internalization was done only as an extension function. It might be that there is now even a custom data type on which you can use comparison operators and such, but I'm not sure about that (redirect to Lucas).
>
> I can confirm that there is a new data type debversion:
>
> SNIP
>
> which is probably hard to reimplement in MySQL. There was a thread here on this list about this data type and its implementation.

Thanks for these comments. That's what I had also identified as Postgres-dependent when we discussed the subject on the ossmole-discuss list.

>> You just need to pay attention (and maybe do some tests): when migrating away from Postgres, you might lose the ability to reuse the same queries that you can currently use with udd.d.o.
>
> IMHO it is not a good idea to try to port UDD to a different database engine.

I think it depends on the kind of analysis people might want to do for their research in mining the Debian metadata that would be archived in FLOSSMole. Maybe these package versions wouldn't be needed, so strings without the full arithmetic of versions would be enough?

In any case, the first idea is to archive the data so that it's there for history, I suppose. Then the users will complain eventually, and we shall see ;)

Thanks again.

Best regards,

--
Olivier BERGER olivier.ber...@it-sudparis.eu
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 1024D/6B829EEC
Research Engineer - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
UDD data - MySQL - Was: Re: Archiving UDD data : FLOSSMole ?
Hi.

On Tuesday 09 June 2009 at 16:55 +0200, Olivier Berger wrote:
> It's very likely, then, that UDD will start being collected by the FLOSSMole team in the future.

Actually, something obvious seems to make things a bit difficult at first sight: UDD is in Postgres, and FLOSSMole uses MySQL.

We're discussing some options on the ossmole-discuss list to overcome this difficulty. My advice is to import into PG on a Debian system with the required dependencies, create views to convert non-standard types to standard ones, then dump again to SQL, and import into MySQL... any comments?

Thanks in advance.

Best regards,

--
Olivier BERGER olivier.ber...@it-sudparis.eu
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 1024D/6B829EEC
Research Engineer - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
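The "create views to convert non-standard types to standard ones" step could be sketched as follows. This is only an illustration in Python that generates the SQL text for such a view: the column list is copied from the "\d packages" output posted elsewhere in this thread (which is truncated there, so it is not the full table), while the view name suffix "_export", the helper name make_export_view, and the assumption that a debversion value can be cast to text with a plain CAST are mine and have not been checked against a UDD instance.

```python
# Columns as shown in the (truncated) "\d packages" listing from this thread.
PACKAGES_COLUMNS = [
    ("package", "text"),
    ("version", "debversion"),
    ("architecture", "text"),
    ("maintainer", "text"),
    ("description", "text"),
    ("long_description", "text"),
    ("source", "text"),
    ("source_version", "debversion"),
]


def make_export_view(table, columns, view_suffix="_export",
                     custom_types=("debversion",)):
    """Build a CREATE VIEW statement that casts custom-typed columns to text.

    Hypothetical helper: names and the CAST-to-text approach are assumptions.
    """
    select_items = []
    for name, pg_type in columns:
        if pg_type in custom_types:
            # Expose the Postgres-specific type as a standard text column.
            select_items.append(f"CAST({name} AS text) AS {name}")
        else:
            select_items.append(name)
    body = ",\n    ".join(select_items)
    return f"CREATE VIEW {table}{view_suffix} AS\nSELECT\n    {body}\nFROM {table};"


print(make_export_view("packages", PACKAGES_COLUMNS))
```

Dumping the resulting view instead of the base table would yield only standard column types, at the cost of losing version-aware comparison on the exported columns.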
Re: Archiving UDD data : FLOSSMole ?
On Monday 08 June 2009 at 14:35 +0200, Olivier Berger wrote:
> On Sunday 07 June 2009 at 08:43 +0200, Stefano Zacchiroli wrote:
>> Even though it will not solve that problem, I would very much welcome periodically injecting data from UDD to the FLOSSMole guys. The simplest way, currently, seems to be handing over the snapshot we periodically release on the web. Have you already asked them if that kind of data would be suitable for them?
>
> More or less, orally, and at least Megan Squire was quite receptive (I actually wrote my mail during her talk's Q&A).
>
> I will try and post to their list and report here.
>
> I'm not exactly sure of all the constraints involved with archiving some data at their place, but I will try and act as a facilitator, and report when I have more details, probably letting you discuss further details ;)

Following up on this, you'll find the thread around http://sourceforge.net/mailarchive/forum.php?thread_name=94d3bd120906090719p3b1000f2u9d1e160cd1fe064a%40mail.gmail.com&forum_name=ossmole-discuss where the idea seems accepted.

It's very likely, then, that UDD will start being collected by the FLOSSMole team in the future.

I'm staying subscribed to both the debian-qa and flossmole-discuss lists to try and proxy as much information as possible, but feel free to make direct contact if you see fit ;)

Best regards,

--
Olivier BERGER olivier.ber...@it-sudparis.eu
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 1024D/6B829EEC
Research Engineer - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
Re: Archiving UDD data : FLOSSMole ?
On Sunday 07 June 2009 at 08:43 +0200, Stefano Zacchiroli wrote:
> Even though it will not solve that problem, I would very much welcome periodically injecting data from UDD to the FLOSSMole guys. The simplest way, currently, seems to be handing over the snapshot we periodically release on the web. Have you already asked them if that kind of data would be suitable for them?

More or less, orally, and at least Megan Squire was quite receptive (I actually wrote my mail during her talk's Q&A).

I will try and post to their list and report here.

I'm not exactly sure of all the constraints involved with archiving some data at their place, but I will try and act as a facilitator, and report when I have more details, probably letting you discuss further details ;)

Btw, I'm not sure if/how we can plan some UDD-related discussions @debconf... not having participated in debconf before... I suppose we'll manage to find a place with cervesa to do so as a last resort ;)

Best regards,

--
Olivier BERGER olivier.ber...@it-sudparis.eu
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 1024D/6B829EEC
Research Engineer - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
Re: Archiving UDD data : FLOSSMole ?
On 08/06/09 at 14:35 +0200, Olivier Berger wrote:
> Btw, I'm not sure if/how we can plan some UDD-related discussions @debconf... not having participated in debconf before... I suppose we'll manage to find a place with cervesa to do so as a last resort ;)

Hi,

I proposed a UDD talk at debconf, which was accepted. So this could be an opportunity to find people interested in UDD, and then have offline discussions.

However, I'm not 100% sure yet that I will come to Debconf, as I might need the time to move to another city if I choose the position I'm being offered there.

Also, sorry again for not being able to respond quickly to all UDD-related emails and questions: I'm currently kept totally busy by real-life stuff. I realize it is very bad timing, but there's not much I can do :(

--
| Lucas Nussbaum
| lu...@lucas-nussbaum.net http://www.lucas-nussbaum.net/ |
| jabber: lu...@nussbaum.fr GPG: 1024D/023B3F4F |
Re: Archiving UDD data : FLOSSMole ?
On Mon, Jun 08, 2009 at 10:28:37PM +0200, Lucas Nussbaum wrote:
> On 08/06/09 at 14:35 +0200, Olivier Berger wrote:
>> Btw, I'm not sure if/how we can plan some UDD-related discussions @debconf... not having participated in debconf before... I suppose we'll manage to find a place with cervesa to do so as a last resort ;)
>
> I proposed a UDD talk at debconf, which was accepted. So this could be an opportunity to find people interested in UDD, and then have offline discussions.

I'm definitely interested, and my talk about the usage of UDD in Debian Pure Blends will also be UDD-related. I'd propose an ad hoc BOF for those who are interested.

> However, I'm not 100% sure yet that I will come to Debconf, as I might need the time to move to another city if I choose the position I'm being offered there.

Would be a shame to miss you.

Kind regards

Andreas.

--
http://fam-tille.de
Re: Archiving UDD data : FLOSSMole ?
On Sat, Jun 06, 2009 at 12:55:39PM +0200, Olivier Berger wrote:
> I think maybe it would be interesting to offer some data from UDD to the FLOSSMole research archive at: http://ossmole.sourceforge.net/ (and http://code.google.com/p/flossmole/ ). By offering, I mean there may be some need for synchronisation with the researchers on technical matters (or others: anonymization, etc.)...

Very interesting! Thanks for the heads up, as I wasn't aware of that project in spite of having needed this kind of historical data in the past.

Actually, during the design of UDD, I pushed a bit to have *some* historical data; in particular I was interested in having all the upload history. Back then, however, it turned out that it was too heavy to have that naively in the database itself.

Even though it will not solve that problem, I would very much welcome periodically injecting data from UDD to the FLOSSMole guys. The simplest way, currently, seems to be handing over the snapshot we periodically release on the web. Have you already asked them if that kind of data would be suitable for them?

> Any comments? I hope to meet some of you at Debconf to talk to you about that.

I'll be there, and I'll very happily join a discussion about this.

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/