Re: Reviving the hardware census

2017-11-28 Thread Jeremy Cline
On 11/26/2017 01:16 PM, Benson Muite wrote:
> Will one be able to opt out or easily choose what information is sent?
I imagine this will end up being opt-in rather than opt-out, but
regardless of that, it'll definitely be easy for the user to configure
what information they send.

> What happens to a database entry when one changes a device (eg. GPU or
> RAM upgrade)?
This depends on the database schema and reporting strategy. It's likely
that a random ID will be generated by the client on first use and sent
along with reports. Supposing the client reports every month, the next
report would include the new/changed devices.
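
As a rough illustration of the random-ID idea (a sketch only: the function
name and the storage path are hypothetical, not something Census does today),
the client could generate a UUID once and reuse it for later reports:

```python
import uuid
from pathlib import Path


def load_or_create_client_id(path: Path) -> str:
    """Return the stored random ID, generating and saving one on first use.

    `path` is a hypothetical location chosen by the caller.
    """
    if path.exists():
        return path.read_text().strip()
    # uuid4() is purely random, so the ID is not derived from any hardware.
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id + "\n")
    return client_id
```

Deleting the file would give the installation a fresh identity, which is one
simple way for a user to break the link to earlier reports.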


-- 
Jeremy Cline
XMPP: jer...@jcline.org
IRC:  jcline

___
kernel mailing list -- kernel@lists.fedoraproject.org
To unsubscribe send an email to kernel-le...@lists.fedoraproject.org


Re: Reviving the hardware census

2017-11-28 Thread Jeremy Cline
On 11/10/2017 01:17 PM, Nathaniel McCallum wrote:
> The more I look at lshw, the more I'm ambivalent (I'm not against it,
> just not for it either). It certainly collects a lot of relevant
> information. However, I see the following problems.
> 
> 1. lshw tries to make things human readable. This is bad for
> databases. We want to record things like immutable numeric hardware
> IDs rather than the current contents of pci.ids. lshw does provide
> -numeric, but this just adds the numbers to the human readable
> strings.

That seems like it would be a good improvement for upstream.

> 2. lshw collects a lot of data we may not be interested in. For
> example, it reports the four different kinds of floppy disk drives my
> BIOS supports. Also, L1, L2 and L3 cache. There is a bunch of stuff
> like this. Is it useful? Some of it definitely is. But transferring
> unwanted data over the wire is not being a good netizen. Also, some
> people pay for data per MB transferred. We should be respectful.
> 
> 3. lshw supports '-sanitize', but this merely replaces the values with
> '[REMOVED]'. See above for why we don't want to transmit this data.
> 
> 4. lshw reports bus configuration. I'm not sure if this is relevant or
> how we would want to map this data. For example, if Dell uses the same
> hardware in a bunch of laptops then we can deduplicate this to one
> "hardware profile." But if bus configuration is part of this profile,
> then we will have much higher cardinality. If this information is
> important to have, we can make it work. But if not, it would be better
> to try to save storage.

These three issues seem easily solvable with some client-side filtering
before the data is sent to the server.
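
A minimal sketch of what such filtering could look like, assuming the
`lshw -json` output has already been parsed into nested dicts with "id" and
"children" keys. The specific keys and node ids dropped here are an
illustrative policy, not a decided one:

```python
# Hypothetical policy: per-node keys that are identifying or describe bus layout.
DROP_KEYS = {"serial", "businfo", "physid", "handle"}
# Hypothetical policy: node id prefixes we do not want to transmit at all.
DROP_IDS = {"cache"}


def filter_node(node):
    """Return a filtered copy of an lshw-style node, or None to drop it."""
    # Node ids may carry an instance suffix such as "cache:0".
    if node.get("id", "").split(":")[0] in DROP_IDS:
        return None
    kept = {k: v for k, v in node.items()
            if k not in DROP_KEYS and k != "children"}
    children = []
    for child in node.get("children", []):
        filtered = filter_node(child)
        if filtered is not None:
            children.append(filtered)
    if children:
        kept["children"] = children
    return kept
```

Dropping the keys entirely, rather than replacing their values, avoids sending
placeholder data over the wire.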

> 5. lshw doesn't seem to have a way to separate "system hardware" from
> "transient hardware." To be fair, this may be impossible. But it would
> be nice to understand the difference between, say, my bluetooth radio
> and my YubiKey. I can't easily remove the former but I can the latter.
> This also goes to cardinality reduction.

That does sound nice and I don't know whether or not it is possible, but
it doesn't sound unique to lshw and if there's a solution I imagine lshw
would like to have it.

> 6. lshw mixes things that are transient with things that are
> permanent. We can remove the timestamps with -notime. But, for
> example, it reports the kernel module currently assigned to a piece of
> hardware. This is probably relevant information, but we will have to
> carefully separate the data based on its lifespan.

I think we'll want the module information. I do agree we need to
think carefully about how to model the data.

Generally, I agree that lshw isn't perfect, but it seems like there's a
lot of value in it and we can help improve it where it makes sense.
Where it doesn't make sense, we can easily manipulate and supplement the
data client-side. Of course, my opinion may change dramatically once we
actually implement this :)


-- 
Jeremy Cline
XMPP: jer...@jcline.org
IRC:  jcline





Re: Reviving the hardware census

2017-11-26 Thread Benson Muite



On 11/22/2017 02:23 AM, Benson Muite wrote:
> On 11/16/2017 06:39 PM, Justin Forbes wrote:
>> On Wed, Nov 15, 2017 at 4:45 AM, Hans de Goede wrote:
>>> Does Census collect info on the CPU the user has and which
>>> "flags" from /proc/cpuinfo. About once a year we have a discussion
>>> on fedora-devel about for example unconditionally using SSE2 everywhere,
>>> and for those discussions have CPU info would be really useful.
>>
>> If it is not collecting this type of information at the moment, I would
>> say it is critical information to grab.
>>
>> Justin
>
> Will one be able to opt out or easily choose what information is sent?
> What happens to a database entry when one changes a device (eg. GPU or
> RAM upgrade)?

A possible solution to some of the privacy issues:

https://github.com/sharemind-sdk/build-sdk

Some care required in setup and maintenance.


Re: Reviving the hardware census

2017-11-21 Thread Benson Muite



On 11/16/2017 06:39 PM, Justin Forbes wrote:
> On Wed, Nov 15, 2017 at 4:45 AM, Hans de Goede wrote:
>> Does Census collect info on the CPU the user has and which
>> "flags" from /proc/cpuinfo. About once a year we have a discussion
>> on fedora-devel about for example unconditionally using SSE2 everywhere,
>> and for those discussions have CPU info would be really useful.
>
> If it is not collecting this type of information at the moment, I would
> say it is critical information to grab.
>
> Justin

Will one be able to opt out or easily choose what information is sent?
What happens to a database entry when one changes a device (e.g. GPU or
RAM upgrade)?



Re: Reviving the hardware census

2017-11-16 Thread Justin Forbes
On Wed, Nov 15, 2017 at 4:45 AM, Hans de Goede  wrote:

>
> Does Census collect info on the CPU the user has and which
> "flags" from /proc/cpuinfo. About once a year we have a discussion
> on fedora-devel about for example unconditionally using SSE2 everywhere,
> and for those discussions have CPU info would be really useful.
>
If it is not collecting this type of information at the moment, I would
say it is critical information to grab.

Justin


Re: Reviving the hardware census

2017-11-15 Thread Jeremy Cline
On 11/15/2017 05:45 AM, Hans de Goede wrote:
> Does Census collect info on the CPU the user has and which
> "flags" from /proc/cpuinfo. About once a year we have a discussion
> on fedora-devel about for example unconditionally using SSE2 everywhere,
> and for those discussions have CPU info would be really useful.

It does not currently do that, but lshw does pick that up so if we opt
to go that route it'll get collected as part of that.
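
For reference, reading the flags straight from /proc/cpuinfo is also simple.
This is a hedged sketch, not how lshw or Census does it, and the function
name is made up for illustration:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags found in /proc/cpuinfo-style text.

    Takes the union over all processors, since each logical CPU has its
    own "flags" line.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so fields such as "vmx flags" are ignored.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags
```

On a live x86 system, `"sse2" in cpu_flags(open("/proc/cpuinfo").read())`
answers the SSE2 question directly. Other architectures name the field
differently (ARM uses "Features"), so a real plugin would need to handle
that.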


-- 
Jeremy Cline
XMPP: jer...@jcline.org
IRC:  jcline



Re: Reviving the hardware census

2017-11-15 Thread Hans de Goede

Hi,

On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:

> Hey folks,
>
> For some time now, Fedora has operated without a database of hardware
> users have. Smolt, the old hardware database, was retired in 2012[0] and
> its intended successor[1] was never deployed by Fedora Infrastructure.
>
> It would be nice to have a hardware database, so I (and hopefully some
> others) would like to get Census up and running for Fedora. Before we
> look at deploying Census, however, it would be good to make sure it has
> everything we need.
>
> Census has client plugins to collect information[2]. At the moment, it
> has plugins for:
>
> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>   each PCI device
>
> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>   as well as the bInterfaceClass, bInterfaceSubClass, and
>   bInterfaceProtocol for each interface
>
> * The contents of /etc/os-release
>
> * All the RPMs installed on a system
>
> Other than the drivers bound to the PCI and USB devices (which is an
> open PR[3]), what else would be good to collect?
>
> [0] https://fedoraproject.org/wiki/Smolt_retirement
> [1] https://github.com/npmccallum/census
> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> [3] https://github.com/npmccallum/census/pull/3


Does Census collect info on the CPU the user has, including the
"flags" from /proc/cpuinfo? About once a year we have a discussion
on fedora-devel about, for example, unconditionally using SSE2 everywhere,
and for those discussions having CPU info would be really useful.

Regards,

Hans


Re: Reviving the hardware census

2017-11-13 Thread Nathaniel McCallum
8. lshw only shows the USB interfaces in the current configuration. I
presume this is because the kernel only exposes this information in /sys.
However, lsusb is able to show the interfaces on all configuration
descriptors (it queries using libusb).

On Fri, Nov 10, 2017 at 2:19 PM, Don Zickus  wrote:
> On Fri, Nov 10, 2017 at 01:17:50PM -0500, Nathaniel McCallum wrote:
>> The more I look at lshw, the more I'm ambivalent (I'm not against it,
>> just not for it either). It certainly collects a lot of relevant
>> information. However, I see the following problems.
>>
>> 1. lshw tries to make things human readable. This is bad for
>> databases. We want to record things like immutable numeric hardware
>> IDs rather than the current contents of pci.ids. lshw does provide
>> -numeric, but this just adds the numbers to the human readable
>> strings.
>>
>> 2. lshw collects a lot of data we may not be interested in. For
>> example, it reports the four different kinds of floppy disk drives my
>> BIOS supports. Also, L1, L2 and L3 cache. There is a bunch of stuff
>> like this. Is it useful? Some of it definitely is. But transferring
>> unwanted data over the wire is not being a good netizen. Also, some
>> people pay for data per MB transferred. We should be respectful.
>>
>> 3. lshw supports '-sanitize', but this merely replaces the values with
>> '[REMOVED]'. See above for why we don't want to transmit this data.
>>
>> 4. lshw reports bus configuration. I'm not sure if this is relevant or
>> how we would want to map this data. For example, if Dell uses the same
>> hardware in a bunch of laptops then we can deduplicate this to one
>> "hardware profile." But if bus configuration is part of this profile,
>> then we will have much higher cardinality. If this information is
>> important to have, we can make it work. But if not, it would be better
>> to try to save storage.
>>
>> 5. lshw doesn't seem to have a way to separate "system hardware" from
>> "transient hardware." To be fair, this may be impossible. But it would
>> be nice to understand the difference between, say, my bluetooth radio
>> and my YubiKey. I can't easily remove the former but I can the latter.
>> This also goes to cardinality reduction.
>>
>> 6. lshw mixes things that are transient with things that are
>> permanent. We can remove the timestamps with -notime. But, for
>> example, it reports the kernel module currently assigned to a piece of
>> hardware. This is probably relevant information, but we will have to
>> carefully separate the data based on its lifespan.
>>
>> 7. lshw reports virtual NICs as physical ones. We probably don't care
>> about this and I certainly don't want to bloat the database with
>> everyone's docker subnets unnecessarily.
>
> Again, I was just pointing out a tool we use internally for inventorying our
> hardware that I thought might be useful.  If it doesn't work for you, feel
> free to choose something else. :-)  I do appreciate the feedback you
> provided.  I might work with folks on my team to address some of them as
> they could be of benefit for our work too.
>
> Cheers,
> Don
>
>
>>
>>
>> On Fri, Nov 10, 2017 at 12:38 PM, Jeremy Cline  wrote:
>> > On 11/09/2017 09:12 AM, Don Zickus wrote:
>> >> On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
>> >>> It isn't documented in F27, but it does work. However, we probably
>> >>> want at least this patch:
>> >>> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
>> >>
>> >> And some beaker stuff looks interesting in
>> >>
>> >> https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
>> >>
>> >> Regardless.  My overall point was the lshw tool seems to embody a lot of
>> >> what you were looking for and thought it could be useful (with some more
>> >> fixes) instead of re-inventing the wheel with new plugins. :-)
>> >>
>> >> Up to you guys.
>> >
>> > I have no desire to re-invent wheels so I think it makes sense for us to
>> > use lshw. Knowing it's being used internally is also good.
>> >
>> >
>> > --
>> > Jeremy Cline
>> > XMPP: jer...@jcline.org
>> > IRC:  jcline
>> >


Re: Reviving the hardware census

2017-11-10 Thread Don Zickus
On Fri, Nov 10, 2017 at 01:17:50PM -0500, Nathaniel McCallum wrote:
> The more I look at lshw, the more I'm ambivalent (I'm not against it,
> just not for it either). It certainly collects a lot of relevant
> information. However, I see the following problems.
> 
> 1. lshw tries to make things human readable. This is bad for
> databases. We want to record things like immutable numeric hardware
> IDs rather than the current contents of pci.ids. lshw does provide
> -numeric, but this just adds the numbers to the human readable
> strings.
> 
> 2. lshw collects a lot of data we may not be interested in. For
> example, it reports the four different kinds of floppy disk drives my
> BIOS supports. Also, L1, L2 and L3 cache. There is a bunch of stuff
> like this. Is it useful? Some of it definitely is. But transferring
> unwanted data over the wire is not being a good netizen. Also, some
> people pay for data per MB transferred. We should be respectful.
> 
> 3. lshw supports '-sanitize', but this merely replaces the values with
> '[REMOVED]'. See above for why we don't want to transmit this data.
> 
> 4. lshw reports bus configuration. I'm not sure if this is relevant or
> how we would want to map this data. For example, if Dell uses the same
> hardware in a bunch of laptops then we can deduplicate this to one
> "hardware profile." But if bus configuration is part of this profile,
> then we will have much higher cardinality. If this information is
> important to have, we can make it work. But if not, it would be better
> to try to save storage.
> 
> 5. lshw doesn't seem to have a way to separate "system hardware" from
> "transient hardware." To be fair, this may be impossible. But it would
> be nice to understand the difference between, say, my bluetooth radio
> and my YubiKey. I can't easily remove the former but I can the latter.
> This also goes to cardinality reduction.
> 
> 6. lshw mixes things that are transient with things that are
> permanent. We can remove the timestamps with -notime. But, for
> example, it reports the kernel module currently assigned to a piece of
> hardware. This is probably relevant information, but we will have to
> carefully separate the data based on its lifespan.
> 
> 7. lshw reports virtual NICs as physical ones. We probably don't care
> about this and I certainly don't want to bloat the database with
> everyone's docker subnets unnecessarily.

Again, I was just pointing out a tool we use internally for inventorying our
hardware that I thought might be useful.  If it doesn't work for you, feel
free to choose something else. :-)  I do appreciate the feedback you
provided.  I might work with folks on my team to address some of them as
they could be of benefit for our work too.

Cheers,
Don


> 
> 
> On Fri, Nov 10, 2017 at 12:38 PM, Jeremy Cline  wrote:
> > On 11/09/2017 09:12 AM, Don Zickus wrote:
> >> On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
> >>> It isn't documented in F27, but it does work. However, we probably
> >>> want at least this patch:
> >>> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
> >>
> >> And some beaker stuff looks interesting in
> >>
> >> https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
> >>
> >> Regardless.  My overall point was the lshw tool seems to embody a lot of
> >> what you were looking for and thought it could be useful (with some more
> >> fixes) instead of re-inventing the wheel with new plugins. :-)
> >>
> >> Up to you guys.
> >
> > I have no desire to re-invent wheels so I think it makes sense for us to
> > use lshw. Knowing it's being used internally is also good.
> >
> >
> > --
> > Jeremy Cline
> > XMPP: jer...@jcline.org
> > IRC:  jcline
> >


Re: Reviving the hardware census

2017-11-10 Thread Nathaniel McCallum
The more I look at lshw, the more I'm ambivalent (I'm not against it,
just not for it either). It certainly collects a lot of relevant
information. However, I see the following problems.

1. lshw tries to make things human readable. This is bad for
databases. We want to record things like immutable numeric hardware
IDs rather than the current contents of pci.ids. lshw does provide
-numeric, but this just adds the numbers to the human readable
strings.

2. lshw collects a lot of data we may not be interested in. For
example, it reports the four different kinds of floppy disk drives my
BIOS supports. Also, L1, L2 and L3 cache. There is a bunch of stuff
like this. Is it useful? Some of it definitely is. But transferring
unwanted data over the wire is not being a good netizen. Also, some
people pay for data per MB transferred. We should be respectful.

3. lshw supports '-sanitize', but this merely replaces the values with
'[REMOVED]'. See above for why we don't want to transmit this data.

4. lshw reports bus configuration. I'm not sure if this is relevant or
how we would want to map this data. For example, if Dell uses the same
hardware in a bunch of laptops then we can deduplicate this to one
"hardware profile." But if bus configuration is part of this profile,
then we will have much higher cardinality. If this information is
important to have, we can make it work. But if not, it would be better
to try to save storage.

5. lshw doesn't seem to have a way to separate "system hardware" from
"transient hardware." To be fair, this may be impossible. But it would
be nice to understand the difference between, say, my bluetooth radio
and my YubiKey. I can't easily remove the former but I can the latter.
This also goes to cardinality reduction.

6. lshw mixes things that are transient with things that are
permanent. We can remove the timestamps with -notime. But, for
example, it reports the kernel module currently assigned to a piece of
hardware. This is probably relevant information, but we will have to
carefully separate the data based on its lifespan.

7. lshw reports virtual NICs as physical ones. We probably don't care
about this and I certainly don't want to bloat the database with
everyone's docker subnets unnecessarily.


On Fri, Nov 10, 2017 at 12:38 PM, Jeremy Cline  wrote:
> On 11/09/2017 09:12 AM, Don Zickus wrote:
>> On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
>>> It isn't documented in F27, but it does work. However, we probably
>>> want at least this patch:
>>> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
>>
>> And some beaker stuff looks interesting in
>>
>> https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
>>
>> Regardless.  My overall point was the lshw tool seems to embody a lot of
>> what you were looking for and thought it could be useful (with some more
>> fixes) instead of re-inventing the wheel with new plugins. :-)
>>
>> Up to you guys.
>
> I have no desire to re-invent wheels so I think it makes sense for us to
> use lshw. Knowing it's being used internally is also good.
>
>
> --
> Jeremy Cline
> XMPP: jer...@jcline.org
> IRC:  jcline
>


Re: Reviving the hardware census

2017-11-10 Thread Jeremy Cline
On 11/09/2017 09:12 AM, Don Zickus wrote:
> On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
>> It isn't documented in F27, but it does work. However, we probably
>> want at least this patch:
>> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
> 
> And some beaker stuff looks interesting in
> 
> https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
> 
> Regardless.  My overall point was the lshw tool seems to embody a lot of
> what you were looking for and thought it could be useful (with some more
> fixes) instead of re-inventing the wheel with new plugins. :-)
> 
> Up to you guys.

I have no desire to re-invent wheels so I think it makes sense for us to
use lshw. Knowing it's being used internally is also good.


-- 
Jeremy Cline
XMPP: jer...@jcline.org
IRC:  jcline



Re: Reviving the hardware census

2017-11-09 Thread Nathaniel McCallum
On Thu, Nov 9, 2017 at 3:26 PM, Chris Murphy  wrote:
> On Thu, Nov 9, 2017 at 1:11 PM, Nathaniel McCallum wrote:
>> Turning it into a hash doesn't solve the tracking problem. It only
>> prevents the attacker from knowing a list of serial numbers. I suspect
>> keeping hashes of identifying information will likely cause
>> controversy.
>
> What is the nature of the tracking problem? A single entry for a
> single machine is not tracking to me. Tracking requires at least two
> points in space-time. What's being stored by the Fedora Project? IP,
> Geolocation, date and time? Those are the things I associate with
> tracking more than a serial number or a hash of a serial number.

I'm an attacker. I observe the serial number for someone's laptop
("Hey! Nice laptop! I'm thinking about getting one of those. How heavy
is it?"). I hash it. I search the Fedora database. I have a list of
every package ever installed and all the hardware they've ever plugged
in. The opportunity for compromise just grew exponentially.

We need to never collect these things. It is part of communicating
trust to our users.
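
To make the point concrete, here is a small sketch. The serial number, the
stored records and the choice of SHA-256 are all made up for illustration.
Because a hash is deterministic, anyone who learns the serial can compute the
same lookup key:

```python
import hashlib


def serial_digest(serial):
    """Hash a serial number the way a hypothetical client would."""
    return hashlib.sha256(serial.encode("utf-8")).hexdigest()


# Hypothetical server-side data keyed by hashed serial number.
database = {
    serial_digest("EXAMPLE-SERIAL-001"): ["rpm: example-package", "usb: example-token"],
}


def lookup(observed_serial):
    """What an attacker can do after reading the serial off the laptop."""
    return database.get(serial_digest(observed_serial))
```

`lookup("EXAMPLE-SERIAL-001")` returns the stored records even though the
database never held the serial itself.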

> Let's say you don't store serial number or a hash, but you do store
> model information, date/time, and an IP address.

We won't collect or store IP addresses. We will, of necessity, have
the IP address of the connection. But we should avoid logging this.

> If there's no
> mechanism to avoid duplicate entries, you've got a bigger tracking
> problem the less common that particular model is.

Less common models are still relatively common. I don't understand the
"duplicate entries" problem you are posing. Individual hardware
devices should be deduplicated (via a UNIQUE constraint). Everything
will be tied via a join table to a master checkin table where the
installation is uniquely identified by a UUID. This way we can see the
hardware associated with a particular checkin (and view changes over
time).

> More models will
> make the data noisy. But if it's a sufficiently rare model, the
> duplicate entries can be assumed to be representing just a few
> distinct machines or even just one machine, and now you can track a
> person even if you don't have any serial number or hashing.

Yes, the problem of rare, identifying hardware combinations is known. It
is not solved even for browsers (browser fingerprinting). Users concerned
about this should disable reporting.

> So I think necessarily you need a way to eliminate duplicates from
> entering the data set. Some way of anonymizing the entry in the Fedora
> Project's data, but also a way to track duplicates.
>
> How about two different data sets stored by the Fedora Project?
> Dataset 1 contains only the hash of the serial number of the device.
> If that hash is not present in dataset 1, then sanitized device data
> is added to dataset 2. If the hash is found in dataset 1, then it's
> not added to dataset 2. But there is no correlation between dataset 1
> and dataset 2?

It looks to me like you're trying to correlate a single hardware
configuration to provide consistency across checkins and you call this
process "deduplication." Census already provides functionality to do
that. Census is not designed just for hardware. It is designed for
gathering general Linux distro statistics, of which hardware is one
component. So we solve this problem once for all reporting modules (of
which hardware is one).

All checkins report a UUID inserted into the master checkins table.
The checkin entry is UNIQUE(uuid, time). All other data points are
correlated to this checkin entry. Let's walk through an example.

A user reports a single PCI device with census. This results in:
1. a new row in the checkin table
2. a possibly new row in the pci table (if this is the first time
we've ever seen this device for all users)
3. a new row in the checkin_pci table joining the checkin to the pci device

As data usage becomes an issue, we can periodically purge old checkins
and their joins. But once a PCI device is seen for the first time, it
is never deleted. And we will never have duplicate entries for a
single PCI device.

This is just theoretical, because we haven't actually designed the
hardware side of the database. But it is an example.
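
For anyone who prefers code, the walkthrough above could be sketched like
this with SQLite. The table and column names follow the description, but the
exact columns (and using only vendor/device to identify a PCI device) are
assumptions for illustration, not the real Census schema:

```python
import sqlite3

SCHEMA = """
CREATE TABLE checkin (
    id   INTEGER PRIMARY KEY,
    uuid TEXT    NOT NULL,
    time INTEGER NOT NULL,
    UNIQUE (uuid, time)
);
CREATE TABLE pci (
    id     INTEGER PRIMARY KEY,
    vendor INTEGER NOT NULL,
    device INTEGER NOT NULL,
    UNIQUE (vendor, device)
);
CREATE TABLE checkin_pci (
    checkin_id INTEGER NOT NULL REFERENCES checkin(id),
    pci_id     INTEGER NOT NULL REFERENCES pci(id),
    UNIQUE (checkin_id, pci_id)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def record_pci(db, uuid, time, vendor, device):
    """Record one checkin that reports a single PCI device."""
    # 1. a new row in the checkin table
    cur = db.execute("INSERT INTO checkin (uuid, time) VALUES (?, ?)",
                     (uuid, time))
    checkin_id = cur.lastrowid
    # 2. a possibly new row in the pci table (ignored if already known)
    db.execute("INSERT OR IGNORE INTO pci (vendor, device) VALUES (?, ?)",
               (vendor, device))
    pci_id = db.execute("SELECT id FROM pci WHERE vendor = ? AND device = ?",
                        (vendor, device)).fetchone()[0]
    # 3. a new row in checkin_pci joining the checkin to the pci device
    db.execute("INSERT INTO checkin_pci (checkin_id, pci_id) VALUES (?, ?)",
               (checkin_id, pci_id))
    return checkin_id
```

Three checkins reporting the same device leave three rows in checkin, three
in checkin_pci, and still only one in pci.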


Re: Reviving the hardware census

2017-11-09 Thread Chris Murphy
On Thu, Nov 9, 2017 at 1:11 PM, Nathaniel McCallum wrote:
> Turning it into a hash doesn't solve the tracking problem. It only
> prevents the attacker from knowing a list of serial numbers. I suspect
> keeping hashes of identifying information will likely cause
> controversy.

What is the nature of the tracking problem? A single entry for a
single machine is not tracking to me. Tracking requires at least two
points in space-time. What's being stored by the Fedora Project? IP,
Geolocation, date and time? Those are the things I associate with
tracking more than a serial number or a hash of a serial number.

Let's say you don't store serial number or a hash, but you do store
model information, date/time, and an IP address. If there's no
mechanism to avoid duplicate entries, you've got a bigger tracking
problem the less common that particular model is. More models will
make the data noisy. But if it's a sufficiently rare model, the
duplicate entries can be assumed to be representing just a few
distinct machines or even just one machine, and now you can track a
person even if you don't have any serial number or hashing.

So I think necessarily you need a way to eliminate duplicates from
entering the data set. Some way of anonymizing the entry in the Fedora
Project's data, but also a way to track duplicates.

How about two different data sets stored by the Fedora Project?
Dataset 1 contains only the hash of the serial number of the device.
If that hash is not present in dataset 1, then sanitized device data
is added to dataset 2. If the hash is found in dataset 1, then it's
not added to dataset 2. But there is no correlation between dataset 1
and dataset 2?
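
Roughly, in code (a sketch of the idea as worded above; the names and the
choice of SHA-256 are placeholders, and in-memory containers stand in for the
two data sets):

```python
import hashlib

seen_hashes = set()    # "dataset 1": only hashes of serial numbers
device_records = []    # "dataset 2": sanitized device data, no hashes


def submit(serial, sanitized_record):
    """Add the sanitized record unless this serial was already counted.

    Returns True if the record was added to dataset 2.
    """
    digest = hashlib.sha256(serial.encode("utf-8")).hexdigest()
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    # Nothing stored here points back at the hash in dataset 1.
    device_records.append(sanitized_record)
    return True
```

Whether keeping the hashes at all is acceptable is a separate question; this
only shows the duplicate check.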


-- 
Chris Murphy


Re: Reviving the hardware census

2017-11-09 Thread Nathaniel McCallum
Turning it into a hash doesn't solve the tracking problem. It only
prevents the attacker from knowing a list of serial numbers. I suspect
keeping hashes of identifying information will likely cause
controversy.

On Thu, Nov 9, 2017 at 2:12 PM, Chris Murphy  wrote:
> On Thu, Nov 9, 2017 at 7:27 AM, Don Zickus  wrote:
>> On Thu, Nov 09, 2017 at 09:19:04AM -0500, Nathaniel McCallum wrote:
>>> Agreed completely. But I still need someone with experience using lshw
>>> to write a data processor (json => SQL) for that data. Also, we will
>>> need to sanitize the lshw output to ensure we omit identifying
>>> information. For example, ip addresses on the network interfaces need
>>> to be filtered out. It might be better to write an option for upstream
>>> lshw to anonymize the output.
>>
>> You mean like 'lshw -sanitize'? :-)
>
> I haven't looked at how sanitize works, but it's probably useful for
> the database to avoid duplicate entries. I think it'd be better to
> turn something like the product serial number into a hash, and then
> use the hash to avoid duplicate entries.
>
> --
> Chris Murphy


Re: Reviving the hardware census

2017-11-09 Thread Chris Murphy
On Thu, Nov 9, 2017 at 7:27 AM, Don Zickus  wrote:
> On Thu, Nov 09, 2017 at 09:19:04AM -0500, Nathaniel McCallum wrote:
>> Agreed completely. But I still need someone with experience using lshw
>> to write a data processor (json => SQL) for that data. Also, we will
>> need to sanitize the lshw output to ensure we omit identifying
>> information. For example, ip addresses on the network interfaces need
>> to be filtered out. It might be better to write an option for upstream
>> lshw to anonymize the output.
>
> You mean like 'lshw -sanitize'? :-)

I haven't looked at how sanitize works, but it's probably useful for
the database to avoid duplicate entries. I think it'd be better to
turn something like the product serial number into a hash, and then
use the hash to avoid duplicate entries.

-- 
Chris Murphy


Re: Reviving the hardware census

2017-11-09 Thread Nathaniel McCallum
Wow! You're fast at getting stuff upstream! ;)

We still need a volunteer for the census side of things.

On Thu, Nov 9, 2017 at 9:27 AM, Don Zickus  wrote:
> On Thu, Nov 09, 2017 at 09:19:04AM -0500, Nathaniel McCallum wrote:
>> Agreed completely. But I still need someone with experience using lshw
>> to write a data processor (json => SQL) for that data. Also, we will
>> need to sanitize the lshw output to ensure we omit identifying
>> information. For example, ip addresses on the network interfaces need
>> to be filtered out. It might be better to write an option for upstream
>> lshw to anonymize the output.
>
> You mean like 'lshw -sanitize'? :-)
>
> Cheers,
> Don
>
>>
>> Any volunteers to work with me on those two items?
>>
>> On Thu, Nov 9, 2017 at 9:12 AM, Don Zickus  wrote:
>> > On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
>> >> It isn't documented in F27, but it does work. However, we probably
>> >> want at least this patch:
>> >> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
>> >
>> > And some beaker stuff looks interesting in
>> >
>> > https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
>> >
>> > Regardless.  My overall point was the lshw tool seems to embody a lot of
>> > what you were looking for and thought it could be useful (with some more
>> > fixes) instead of re-inventing the wheel with new plugins. :-)
>> >
>> > Up to you guys.
>> >
>> > Cheers,
>> > Don
>> >
>> >>
>> >> On Wed, Nov 8, 2017 at 4:28 PM, Don Zickus  wrote:
>> >> > On Wed, Nov 08, 2017 at 04:09:26PM -0500, Nathaniel McCallum wrote:
>> >> >> I just looked at the code for lshw. The master branch already supports
>> >> >> JSON. We just need them to release it.
>> >> >
>> >> > Eh?  'lshw -json' doesn't work for you?  I thought that was a supported
>> >> > output for a while now.  At least it works on my F27 box, but I think we
>> >> > have it running successfully under RHEL-7 too.
>> >> >
>> >> > Cheers,
>> >> > Don
>> >> >
>> >> >>
>> >> >> On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
>> >> >> > On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
>> >> >> >> I just played around with lshw a bit. We should totally make it export
>> >> >> >> JSON. We can then submit this directly (as one census plugin).
>> >> >> >
>> >> >> > Yes, that is how we use it to update hardware info internally to our Beaker
>> >> >> > instance. :-)
>> >> >> >
>> >> >> > Cheers,
>> >> >> > Don
>> >> >> >
>> >> >> >>
>> >> >> >> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
>> >> >> >> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
>> >> >> >> >> Hey folks,
>> >> >> >> >>
>> >> >> >> >> For some time now, Fedora has operated without a database of hardware
>> >> >> >> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> >> >> >> >> its intended successor[1] was never deployed by Fedora Infrastructure.
>> >> >> >> >>
>> >> >> >> >> It would be nice to have a hardware database, so I (and hopefully some
>> >> >> >> >> others) would like to get Census up and running for Fedora. Before we
>> >> >> >> >> look at deploying Census, however, it would be good to make sure it has
>> >> >> >> >> everything we need.
>> >> >> >> >>
>> >> >> >> >> Census has client plugins to collect information[2]. At the moment, it
>> >> >> >> >> has plugins for:
>> >> >> >> >>
>> >> >> >> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>> >> >> >> >>   each PCI device
>> >> >> >> >>
>> >> >> >> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>> >> >> >> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
>> >> >> >> >>   bInterfaceProtocol for each interface
>> >> >> >> >>
>> >> >> >> >> * The contents of /etc/os-release
>> >> >> >> >>
>> >> >> >> >> * All the RPMs installed on a system
>> >> >> >> >>
>> >> >> >> >> Other than the drivers bound to the PCI and USB devices (which is an
>> >> >> >> >> open PR[3]), what else would be good to collect?
>> >> >> >> >>
>> >> >> >> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> >> >> >> >> [1] https://github.com/npmccallum/census
>> >> >> >> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> >> >> >> >> [3] https://github.com/npmccallum/census/pull/3
>> >> >> >> >
>> >> >> >> > Internally, we have been focusing on using 'lshw' as the tool that provides
>> >> >> >> > all that info and handles all the arch funkiness (and includes firmware).
>> >> >> >> > If there is anything missing, we have tried to push upstream to that
>> >> >> >> > project.
Re: Reviving the hardware census

2017-11-09 Thread Don Zickus
On Thu, Nov 09, 2017 at 09:19:04AM -0500, Nathaniel McCallum wrote:
> Agreed completely. But I still need someone with experience using lshw
> to write a data processor (json => SQL) for that data. Also, we will
> need to sanitize the lshw output to ensure we omit identifying
> information. For example, ip addresses on the network interfaces need
> to be filtered out. It might be better to write an option for upstream
> lshw to anonymize the output.

You mean like 'lshw -sanitize'? :-)

Cheers,
Don

> 
> Any volunteers to work with me on those two items?
> 
> On Thu, Nov 9, 2017 at 9:12 AM, Don Zickus  wrote:
> > On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
> >> It isn't documented in F27, but it does work. However, we probably
> >> want at least this patch:
> >> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
> >
> > And some beaker stuff looks interesting in
> >
> > https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
> >
> > Regardless.  My overall point was the lshw tool seems to embody a lot of
> > what you were looking for and thought it could be useful (with some more
> > fixes) instead of re-inventing the wheel with new plugins. :-)
> >
> > Up to you guys.
> >
> > Cheers,
> > Don
> >
> >>
> >> On Wed, Nov 8, 2017 at 4:28 PM, Don Zickus  wrote:
> >> > On Wed, Nov 08, 2017 at 04:09:26PM -0500, Nathaniel McCallum wrote:
> >> >> I just looked at the code for lshw. The master branch already supports
> >> >> JSON. We just need them to release it.
> >> >
> >> > Eh?  'lshw -json' doesn't work for you?  I thought that was a supported
> >> > output for a while now.  At least it works on my F27 box, but I think we
> >> > have it running successfully under RHEL-7 too.
> >> >
> >> > Cheers,
> >> > Don
> >> >
> >> >>
> >> >> On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
> >> >> > On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
> >> >> >> I just played around with lshw a bit. We should totally make it export
> >> >> >> JSON. We can then submit this directly (as one census plugin).
> >> >> >
> >> >> > Yes, that is how we use it to update hardware info internally to our Beaker
> >> >> > instance. :-)
> >> >> >
> >> >> > Cheers,
> >> >> > Don
> >> >> >
> >> >> >>
> >> >> >> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
> >> >> >> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
> >> >> >> >> Hey folks,
> >> >> >> >>
> >> >> >> >> For some time now, Fedora has operated without a database of hardware
> >> >> >> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
> >> >> >> >> its intended successor[1] was never deployed by Fedora Infrastructure.
> >> >> >> >>
> >> >> >> >> It would be nice to have a hardware database, so I (and hopefully some
> >> >> >> >> others) would like to get Census up and running for Fedora. Before we
> >> >> >> >> look at deploying Census, however, it would be good to make sure it has
> >> >> >> >> everything we need.
> >> >> >> >>
> >> >> >> >> Census has client plugins to collect information[2]. At the moment, it
> >> >> >> >> has plugins for:
> >> >> >> >>
> >> >> >> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
> >> >> >> >>   each PCI device
> >> >> >> >>
> >> >> >> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
> >> >> >> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
> >> >> >> >>   bInterfaceProtocol for each interface
> >> >> >> >>
> >> >> >> >> * The contents of /etc/os-release
> >> >> >> >>
> >> >> >> >> * All the RPMs installed on a system
> >> >> >> >>
> >> >> >> >> Other than the drivers bound to the PCI and USB devices (which is an
> >> >> >> >> open PR[3]), what else would be good to collect?
> >> >> >> >>
> >> >> >> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
> >> >> >> >> [1] https://github.com/npmccallum/census
> >> >> >> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> >> >> >> >> [3] https://github.com/npmccallum/census/pull/3
> >> >> >> >
> >> >> >> > Internally, we have been focusing on using 'lshw' as the tool that provides
> >> >> >> > all that info and handles all the arch funkiness (and includes firmware).
> >> >> >> > If there is anything missing, we have tried to push upstream to that
> >> >> >> > project.
> >> >> >> >
> >> >> >> > Would that cover a lot of the info you are looking for?
> >> >> >> >
> >> >> >> > Cheers,
> >> >> >> > Don

Re: Reviving the hardware census

2017-11-09 Thread Nathaniel McCallum
Agreed completely. But I still need someone with experience using lshw
to write a data processor (json => SQL) for that data. Also, we will
need to sanitize the lshw output to ensure we omit identifying
information. For example, ip addresses on the network interfaces need
to be filtered out. It might be better to write an option for upstream
lshw to anonymize the output.

Any volunteers to work with me on those two items?
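
A rough sketch of what those two items could look like together, assuming
Python, SQLite as a stand-in database, and a report that is a single JSON
object whose nodes carry "id", "class", "vendor", "product", "serial",
"configuration", and "children" keys. The key names, the table layout, and
the choice of which fields count as identifying are all assumptions to be
checked against real 'lshw -json' output, not a description of lshw or
Census.

```python
import sqlite3

# Fields treated as identifying; an assumption, not an exhaustive list.
SENSITIVE_KEYS = {"serial"}
SENSITIVE_CONFIG_KEYS = {"ip"}


def sanitize(node: dict) -> dict:
    """Return a copy of the node tree without identifying fields."""
    clean = {
        key: value
        for key, value in node.items()
        if key not in SENSITIVE_KEYS and key not in ("configuration", "children")
    }
    config = node.get("configuration")
    if isinstance(config, dict):
        clean["configuration"] = {
            key: value
            for key, value in config.items()
            if key not in SENSITIVE_CONFIG_KEYS
        }
    children = node.get("children")
    if isinstance(children, list):
        clean["children"] = [sanitize(child) for child in children]
    return clean


def flatten(node: dict, parent=None):
    """Yield one (id, class, vendor, product, parent id) row per node."""
    yield (
        node.get("id"),
        node.get("class"),
        node.get("vendor"),
        node.get("product"),
        parent,
    )
    for child in node.get("children", []):
        yield from flatten(child, node.get("id"))


def store(conn: sqlite3.Connection, report: dict) -> None:
    """Sanitize a report and insert its nodes into a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS device "
        "(id TEXT, device_class TEXT, vendor TEXT, product TEXT, parent TEXT)"
    )
    conn.executemany(
        "INSERT INTO device VALUES (?, ?, ?, ?, ?)",
        flatten(sanitize(report)),
    )
    conn.commit()
```

Filtering on the client before anything is sent keeps addresses and serial
numbers off the wire entirely, which a server-side filter could not
guarantee.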

On Thu, Nov 9, 2017 at 9:12 AM, Don Zickus  wrote:
> On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
>> It isn't documented in F27, but it does work. However, we probably
>> want at least this patch:
>> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4
>
> And some beaker stuff looks interesting in
>
> https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9
>
> Regardless.  My overall point was the lshw tool seems to embody a lot of
> what you were looking for and thought it could be useful (with some more
> fixes) instead of re-inventing the wheel with new plugins. :-)
>
> Up to you guys.
>
> Cheers,
> Don
>
>>
>> On Wed, Nov 8, 2017 at 4:28 PM, Don Zickus  wrote:
>> > On Wed, Nov 08, 2017 at 04:09:26PM -0500, Nathaniel McCallum wrote:
>> >> I just looked at the code for lshw. The master branch already supports
>> >> JSON. We just need them to release it.
>> >
>> > Eh?  'lshw -json' doesn't work for you?  I thought that was a supported
>> > output for a while now.  At least it works on my F27 box, but I think we
>> > have it running successfully under RHEL-7 too.
>> >
>> > Cheers,
>> > Don
>> >
>> >>
>> >> On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
>> >> > On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
>> >> >> I just played around with lshw a bit. We should totally make it export
>> >> >> JSON. We can then submit this directly (as one census plugin).
>> >> >
>> >> > Yes, that is how we use it to update hardware info internally to our Beaker
>> >> > instance. :-)
>> >> >
>> >> > Cheers,
>> >> > Don
>> >> >
>> >> >>
>> >> >> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
>> >> >> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
>> >> >> >> Hey folks,
>> >> >> >>
>> >> >> >> For some time now, Fedora has operated without a database of hardware
>> >> >> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> >> >> >> its intended successor[1] was never deployed by Fedora Infrastructure.
>> >> >> >>
>> >> >> >> It would be nice to have a hardware database, so I (and hopefully some
>> >> >> >> others) would like to get Census up and running for Fedora. Before we
>> >> >> >> look at deploying Census, however, it would be good to make sure it has
>> >> >> >> everything we need.
>> >> >> >>
>> >> >> >> Census has client plugins to collect information[2]. At the moment, it
>> >> >> >> has plugins for:
>> >> >> >>
>> >> >> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>> >> >> >>   each PCI device
>> >> >> >>
>> >> >> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>> >> >> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
>> >> >> >>   bInterfaceProtocol for each interface
>> >> >> >>
>> >> >> >> * The contents of /etc/os-release
>> >> >> >>
>> >> >> >> * All the RPMs installed on a system
>> >> >> >>
>> >> >> >> Other than the drivers bound to the PCI and USB devices (which is an
>> >> >> >> open PR[3]), what else would be good to collect?
>> >> >> >>
>> >> >> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> >> >> >> [1] https://github.com/npmccallum/census
>> >> >> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> >> >> >> [3] https://github.com/npmccallum/census/pull/3
>> >> >> >
>> >> >> > Internally, we have been focusing on using 'lshw' as the tool that provides
>> >> >> > all that info and handles all the arch funkiness (and includes firmware).
>> >> >> > If there is anything missing, we have tried to push upstream to that
>> >> >> > project.
>> >> >> >
>> >> >> > Would that cover a lot of the info you are looking for?
>> >> >> >
>> >> >> > Cheers,
>> >> >> > Don

Re: Reviving the hardware census

2017-11-09 Thread Don Zickus
On Wed, Nov 08, 2017 at 05:02:05PM -0500, Nathaniel McCallum wrote:
> It isn't documented in F27, but it does work. However, we probably
> want at least this patch:
> https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4

And some beaker stuff looks interesting in

https://github.com/lyonel/lshw/commit/f95aa917a84a8ee74ce79e9b4f9e198d21a2e4d9

Regardless, my overall point was that the lshw tool seems to embody a lot
of what you were looking for, and I thought it could be useful (with some
more fixes) instead of re-inventing the wheel with new plugins. :-)

Up to you guys.

Cheers,
Don

> 
> On Wed, Nov 8, 2017 at 4:28 PM, Don Zickus  wrote:
> > On Wed, Nov 08, 2017 at 04:09:26PM -0500, Nathaniel McCallum wrote:
> >> I just looked at the code for lshw. The master branch already supports
> >> JSON. We just need them to release it.
> >
> > Eh?  'lshw -json' doesn't work for you?  I thought that was a supported
> > output for a while now.  At least it works on my F27 box, but I think we
> > have it running successfully under RHEL-7 too.
> >
> > Cheers,
> > Don
> >
> >>
> >> On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
> >> > On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
> >> >> I just played around with lshw a bit. We should totally make it export
> >> >> JSON. We can then submit this directly (as one census plugin).
> >> >
> >> > Yes, that is how we use it to update hardware info internally to our Beaker
> >> > instance. :-)
> >> >
> >> > Cheers,
> >> > Don
> >> >
> >> >>
> >> >> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
> >> >> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
> >> >> >> Hey folks,
> >> >> >>
> >> >> >> For some time now, Fedora has operated without a database of hardware
> >> >> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
> >> >> >> its intended successor[1] was never deployed by Fedora Infrastructure.
> >> >> >>
> >> >> >> It would be nice to have a hardware database, so I (and hopefully some
> >> >> >> others) would like to get Census up and running for Fedora. Before we
> >> >> >> look at deploying Census, however, it would be good to make sure it has
> >> >> >> everything we need.
> >> >> >>
> >> >> >> Census has client plugins to collect information[2]. At the moment, it
> >> >> >> has plugins for:
> >> >> >>
> >> >> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
> >> >> >>   each PCI device
> >> >> >>
> >> >> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
> >> >> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
> >> >> >>   bInterfaceProtocol for each interface
> >> >> >>
> >> >> >> * The contents of /etc/os-release
> >> >> >>
> >> >> >> * All the RPMs installed on a system
> >> >> >>
> >> >> >> Other than the drivers bound to the PCI and USB devices (which is an
> >> >> >> open PR[3]), what else would be good to collect?
> >> >> >>
> >> >> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
> >> >> >> [1] https://github.com/npmccallum/census
> >> >> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> >> >> >> [3] https://github.com/npmccallum/census/pull/3
> >> >> >
> >> >> > Internally, we have been focusing on using 'lshw' as the tool that provides
> >> >> > all that info and handles all the arch funkiness (and includes firmware).
> >> >> > If there is anything missing, we have tried to push upstream to that
> >> >> > project.
> >> >> >
> >> >> > Would that cover a lot of the info you are looking for?
> >> >> >
> >> >> > Cheers,
> >> >> > Don


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
It isn't documented in F27, but it does work. However, we probably
want at least this patch:
https://github.com/lyonel/lshw/commit/135a853c60582b14c5b67e5cd988a8062d9896f4

On Wed, Nov 8, 2017 at 4:28 PM, Don Zickus  wrote:
> On Wed, Nov 08, 2017 at 04:09:26PM -0500, Nathaniel McCallum wrote:
>> I just looked at the code for lshw. The master branch already supports
>> JSON. We just need them to release it.
>
> Eh?  'lshw -json' doesn't work for you?  I thought that was a supported
> output for a while now.  At least it works on my F27 box, but I think we
> have it running successfully under RHEL-7 too.
>
> Cheers,
> Don
>
>>
>> On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
>> > On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
>> >> I just played around with lshw a bit. We should totally make it export
>> >> JSON. We can then submit this directly (as one census plugin).
>> >
>> > Yes, that is how we use it to update hardware info internally to our Beaker
>> > instance. :-)
>> >
>> > Cheers,
>> > Don
>> >
>> >>
>> >> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
>> >> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
>> >> >> Hey folks,
>> >> >>
>> >> >> For some time now, Fedora has operated without a database of hardware
>> >> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> >> >> its intended successor[1] was never deployed by Fedora Infrastructure.
>> >> >>
>> >> >> It would be nice to have a hardware database, so I (and hopefully some
>> >> >> others) would like to get Census up and running for Fedora. Before we
>> >> >> look at deploying Census, however, it would be good to make sure it has
>> >> >> everything we need.
>> >> >>
>> >> >> Census has client plugins to collect information[2]. At the moment, it
>> >> >> has plugins for:
>> >> >>
>> >> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>> >> >>   each PCI device
>> >> >>
>> >> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>> >> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
>> >> >>   bInterfaceProtocol for each interface
>> >> >>
>> >> >> * The contents of /etc/os-release
>> >> >>
>> >> >> * All the RPMs installed on a system
>> >> >>
>> >> >> Other than the drivers bound to the PCI and USB devices (which is an
>> >> >> open PR[3]), what else would be good to collect?
>> >> >>
>> >> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> >> >> [1] https://github.com/npmccallum/census
>> >> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> >> >> [3] https://github.com/npmccallum/census/pull/3
>> >> >
>> >> > Internally, we have been focusing on using 'lshw' as the tool that provides
>> >> > all that info and handles all the arch funkiness (and includes firmware).
>> >> > If there is anything missing, we have tried to push upstream to that
>> >> > project.
>> >> >
>> >> > Would that cover a lot of the info you are looking for?
>> >> >
>> >> > Cheers,
>> >> > Don


Re: Reviving the hardware census

2017-11-08 Thread Don Zickus
On Wed, Nov 08, 2017 at 04:09:26PM -0500, Nathaniel McCallum wrote:
> I just looked at the code for lshw. The master branch already supports
> JSON. We just need them to release it.

Eh?  'lshw -json' doesn't work for you?  I thought that was a supported
output for a while now.  At least it works on my F27 box, but I think we
have it running successfully under RHEL-7 too.

Cheers,
Don

> 
> On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
> > On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
> >> I just played around with lshw a bit. We should totally make it export
> >> JSON. We can then submit this directly (as one census plugin).
> >
> > Yes, that is how we use it to update hardware info internally to our Beaker
> > instance. :-)
> >
> > Cheers,
> > Don
> >
> >>
> >> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
> >> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
> >> >> Hey folks,
> >> >>
> >> >> For some time now, Fedora has operated without a database of hardware
> >> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
> >> >> its intended successor[1] was never deployed by Fedora Infrastructure.
> >> >>
> >> >> It would be nice to have a hardware database, so I (and hopefully some
> >> >> others) would like to get Census up and running for Fedora. Before we
> >> >> look at deploying Census, however, it would be good to make sure it has
> >> >> everything we need.
> >> >>
> >> >> Census has client plugins to collect information[2]. At the moment, it
> >> >> has plugins for:
> >> >>
> >> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
> >> >>   each PCI device
> >> >>
> >> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
> >> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
> >> >>   bInterfaceProtocol for each interface
> >> >>
> >> >> * The contents of /etc/os-release
> >> >>
> >> >> * All the RPMs installed on a system
> >> >>
> >> >> Other than the drivers bound to the PCI and USB devices (which is an
> >> >> open PR[3]), what else would be good to collect?
> >> >>
> >> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
> >> >> [1] https://github.com/npmccallum/census
> >> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> >> >> [3] https://github.com/npmccallum/census/pull/3
> >> >
> >> > Internally, we have been focusing on using 'lshw' as the tool that provides
> >> > all that info and handles all the arch funkiness (and includes firmware).
> >> > If there is anything missing, we have tried to push upstream to that
> >> > project.
> >> >
> >> > Would that cover a lot of the info you are looking for?
> >> >
> >> > Cheers,
> >> > Don


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
I just looked at the code for lshw. The master branch already supports
JSON. We just need them to release it.

On Wed, Nov 8, 2017 at 3:23 PM, Don Zickus  wrote:
> On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
>> I just played around with lshw a bit. We should totally make it export
>> JSON. We can then submit this directly (as one census plugin).
>
> Yes, that is how we use it to update hardware info internally to our Beaker
> instance. :-)
>
> Cheers,
> Don
>
>>
>> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
>> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
>> >> Hey folks,
>> >>
>> >> For some time now, Fedora has operated without a database of hardware
>> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> >> its intended successor[1] was never deployed by Fedora Infrastructure.
>> >>
>> >> It would be nice to have a hardware database, so I (and hopefully some
>> >> others) would like to get Census up and running for Fedora. Before we
>> >> look at deploying Census, however, it would be good to make sure it has
>> >> everything we need.
>> >>
>> >> Census has client plugins to collect information[2]. At the moment, it
>> >> has plugins for:
>> >>
>> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>> >>   each PCI device
>> >>
>> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
>> >>   bInterfaceProtocol for each interface
>> >>
>> >> * The contents of /etc/os-release
>> >>
>> >> * All the RPMs installed on a system
>> >>
>> >> Other than the drivers bound to the PCI and USB devices (which is an
>> >> open PR[3]), what else would be good to collect?
>> >>
>> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> >> [1] https://github.com/npmccallum/census
>> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> >> [3] https://github.com/npmccallum/census/pull/3
>> >
>> > Internally, we have been focusing on using 'lshw' as the tool that provides
>> > all that info and handles all the arch funkiness (and includes firmware).
>> > If there is anything missing, we have tried to push upstream to that
>> > project.
>> >
>> > Would that cover a lot of the info you are looking for?
>> >
>> > Cheers,
>> > Don


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
External plugins? No. We are talking about internal modular interfaces
used to separate the code conceptually. This allows us to delegate
data collection easily to domain experts. It also allows users to
choose, somewhat coarsely, which data they report. For example, some
users may be fine with submitting hardware but not a list of their
installed packages. Dividing the data cleanly enables this.
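
One way to picture that division, as a minimal sketch in Python: each
collector registers under a name, and the report contains only the
collectors the user has enabled. The registry, the collector names, and the
placeholder return values are hypothetical and are not taken from the
Census code.

```python
from typing import Callable, Dict, Iterable

# Hypothetical registry mapping a collector name to the function behind it.
COLLECTORS: Dict[str, Callable[[], dict]] = {}


def collector(name: str):
    """Register a data collector under a user-visible name."""
    def register(func: Callable[[], dict]) -> Callable[[], dict]:
        COLLECTORS[name] = func
        return func
    return register


@collector("hardware")
def collect_hardware() -> dict:
    # Placeholder: a real collector would read PCI and USB IDs.
    return {"pci": [], "usb": []}


@collector("packages")
def collect_packages() -> dict:
    # Placeholder: a real collector would list installed RPMs.
    return {"rpms": []}


def build_report(enabled: Iterable[str]) -> dict:
    """Run only the collectors the user opted into."""
    names = set(enabled)
    unknown = names - set(COLLECTORS)
    if unknown:
        raise ValueError("unknown collectors: " + ", ".join(sorted(unknown)))
    return {name: COLLECTORS[name]() for name in sorted(names)}
```

A user who is fine with reporting hardware but not packages would end up
with build_report({"hardware"}), and the packages collector never runs.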

On Wed, Nov 8, 2017 at 3:59 PM, Jeremy Cline  wrote:
> On 11/08/2017 03:18 PM, Nathaniel McCallum wrote:
>> I agree completely. My point is not that we don't need any planning,
>> but that the planning is scoped per plugin.
>
> Do we really need the concept of plugins, though? Are there going to be
> plugins that live outside of the census "core"? Will users want to mix
> and match plugins? Are all the plugins combined more than most users
> want to install? I don't feel like the answer to any of those questions
> is "yes".
>
> I'm all for defining solid internal interfaces so it's easy to extend,
> and I expect users will want to be able to limit what they send in
> (maybe just hardware and no software report), but neither of those
> things seem like they require plugins.
>
> Perhaps I'm being short-sighted. I wasn't around for Smolt and so it's
> hard for me to know all the pain-points of its design. It just seems to
> me that scoping planning to individual pieces of the system is going to
> lead to a whole bunch of disconnected buckets of data that won't be
> pleasant to work with.
>
>
> --
> Jeremy Cline
> XMPP: jer...@jcline.org
> IRC:  jcline
>


Re: Reviving the hardware census

2017-11-08 Thread Jeremy Cline
On 11/08/2017 03:18 PM, Nathaniel McCallum wrote:
> I agree completely. My point is not that we don't need any planning,
> but that the planning is scoped per plugin.

Do we really need the concept of plugins, though? Are there going to be
plugins that live outside of the census "core"? Will users want to mix
and match plugins? Are all the plugins combined more than most users
want to install? I don't feel like the answer to any of those questions
is "yes".

I'm all for defining solid internal interfaces so it's easy to extend,
and I expect users will want to be able to limit what they send in
(maybe just hardware and no software report), but neither of those
things seem like they require plugins.

Perhaps I'm being short-sighted. I wasn't around for Smolt and so it's
hard for me to know all the pain-points of its design. It just seems to
me that scoping planning to individual pieces of the system is going to
lead to a whole bunch of disconnected buckets of data that won't be
pleasant to work with.


-- 
Jeremy Cline
XMPP: jer...@jcline.org
IRC:  jcline



Re: Reviving the hardware census

2017-11-08 Thread Don Zickus
On Wed, Nov 08, 2017 at 03:16:24PM -0500, Nathaniel McCallum wrote:
> I just played around with lshw a bit. We should totally make it export
> JSON. We can then submit this directly (as one census plugin).

Yes, that is how we use it to update hardware info internally to our Beaker
instance. :-)

Cheers,
Don

> 
> On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
> > On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
> >> Hey folks,
> >>
> >> For some time now, Fedora has operated without a database of hardware
> >> users have. Smolt, the old hardware database, was retired in 2012[0] and
> >> its intended successor[1] was never deployed by Fedora Infrastructure.
> >>
> >> It would be nice to have a hardware database, so I (and hopefully some
> >> others) would like to get Census up and running for Fedora. Before we
> >> look at deploying Census, however, it would be good to make sure it has
> >> everything we need.
> >>
> >> Census has client plugins to collect information[2]. At the moment, it
> >> has plugins for:
> >>
> >> * The vendor, device, subsystem_vendor, subsystem_device, and class from
> >>   each PCI device
> >>
> >> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
> >>   as well as the bInterfaceClass, bInterfaceSubClass, and
> >>   bInterfaceProtocol for each interface
> >>
> >> * The contents of /etc/os-release
> >>
> >> * All the RPMs installed on a system
> >>
> >> Other than the drivers bound to the PCI and USB devices (which is an
> >> open PR[3]), what else would be good to collect?
> >>
> >> [0] https://fedoraproject.org/wiki/Smolt_retirement
> >> [1] https://github.com/npmccallum/census
> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> >> [3] https://github.com/npmccallum/census/pull/3
> >
> > Internally, we have been focusing on using 'lshw' as the tool that provides
> > all that info and handles all the arch funkiness (and includes firmware).
> > If there is anything missing, we have tried to push upstream to that
> > project.
> >
> > Would that cover a lot of the info you are looking for?
> >
> > Cheers,
> > Don


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
I agree completely. My point is not that we don't need any planning,
but that the planning is scoped per plugin.

On Wed, Nov 8, 2017 at 3:05 PM, Jeremy Cline  wrote:
> On 11/08/2017 09:24 AM, Nathaniel McCallum wrote:
>> Here is why I don't think we need to have all the data collection
>> requirements up front. Census is designed to be very modular. A data
>> collector plugin is just an executable that outputs a JSON blob. A
>> corresponding server-side plugin parses this data and stores it in the
>> database in an efficient way to be later queried.
>
> I certainly don't expect to walk away from this with everything everyone
> wants (and nothing people don't want), but it doesn't hurt to spend a
> little time up-front thinking about this.
>
> Since we're going to be putting this in a database and hopefully we have
> a lot of data, thinking about what that data will look like now will
> save us and those who have to maintain this system in production a great
> deal of pain. It looks like the current implementation uses MongoDB,
> which could either be a good choice or a bad choice depending on what
> the data schema looks like. If it is a good choice, we need to be able
> to prove that since Fedora Infrastructure uses PostgreSQL for pretty
> much everything and they probably won't be thrilled about maintaining
> another database.
>
> If we opt for PostgreSQL that means we really do have to model our data
> (which I think is a good thing to do even with something like MongoDB), so we
> should have a pretty solid idea of what it looks like.
>
>
> --
> Jeremy Cline
> XMPP: jer...@jcline.org
> IRC:  jcline
>


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
I just played around with lshw a bit. We should totally make it export
JSON. We can then submit this directly (as one census plugin).
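
As a rough sketch of what such a plugin could look like (not code from the
census repository): `lshw -json` already emits a tree of nodes, so a
collector could walk it and keep only a few fields per node. The KEEP list,
the function names, and the decision to flatten the tree are all assumptions
made for illustration.

```python
import json
import subprocess

# Hypothetical choice of fields to keep from each lshw node; everything
# else is dropped before the data would be submitted.
KEEP = ("id", "class", "vendor", "product")


def flatten(node):
    """Yield a trimmed dict for node and for every descendant under "children"."""
    yield {key: node[key] for key in KEEP if key in node}
    for child in node.get("children", []):
        yield from flatten(child)


def collect_lshw():
    """Run `lshw -json` and return the flattened, trimmed list of nodes."""
    raw = subprocess.run(["lshw", "-json"], check=True,
                         capture_output=True, text=True).stdout
    tree = json.loads(raw)
    # Accept either a single root object or a list of root objects.
    roots = tree if isinstance(tree, list) else [tree]
    return [item for root in roots for item in flatten(root)]
```

Trimming on the client keeps the submitted blob small, at the cost of having
to ship a new plugin whenever another field turns out to be interesting.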

On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
> On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
>> Hey folks,
>>
>> For some time now, Fedora has operated without a database of hardware
>> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> its intended successor[1] was never deployed by Fedora Infrastructure.
>>
>> It would be nice to have a hardware database, so I (and hopefully some
>> others) would like to get Census up and running for Fedora. Before we
>> look at deploying Census, however, it would be good to make sure it has
>> everything we need.
>>
>> Census has client plugins to collect information[2]. At the moment, it
>> has plugins for:
>>
>> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>>   each PCI device
>>
>> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>>   as well as the bInterfaceClass, bInterfaceSubClass, and
>>   bInterfaceProtocol for each interface
>>
>> * The contents of /etc/os-release
>>
>> * All the RPMs installed on a system
>>
>> Other than the drivers bound to the PCI and USB devices (which is an
>> open PR[3]), what else would be good to collect?
>>
>> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> [1] https://github.com/npmccallum/census
>> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> [3] https://github.com/npmccallum/census/pull/3
>
> Internally, we have been focusing on using 'lshw' as the tool that provides
> all that info and handles all the arch funkiness (and includes firmware).
> If there is anything missing, we have tried to push upstream to that
> project.
>
> Would that cover a lot of the info you are looking for?
>
> Cheers,
> Don


Re: Reviving the hardware census

2017-11-08 Thread Jeremy Cline
On 11/08/2017 09:24 AM, Nathaniel McCallum wrote:
> Here is why I don't think we need to have all the data collection
> requirements up front. Census is designed to be very modular. A data
> collector plugin is just an executable that outputs a JSON blob. A
> corresponding server-side plugin parses this data and stores it in the
> database in an efficient way to be later queried.

I certainly don't expect to walk away from this with everything everyone
wants (and nothing people don't want), but it doesn't hurt to spend a
little time up-front thinking about this.

Since we're going to be putting this in a database and hopefully we have
a lot of data, thinking about what that data will look like now will
save us and those who have to maintain this system in production a great
deal of pain. It looks like the current implementation uses MongoDB,
which could either be a good choice or a bad choice depending on what
the data schema looks like. If it is a good choice, we need to be able
to prove that since Fedora Infrastructure uses PostgreSQL for pretty
much everything and they probably won't be thrilled about maintaining
another database.

If we opt for PostgreSQL that means we really do have to model our data
(which I think is a good thing to do even with something like MongoDB), so we
should have a pretty solid idea of what it looks like.
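
To make "model our data" concrete, here is a minimal relational sketch. It
is purely illustrative: the table and column names are hypothetical, SQLite
stands in for PostgreSQL so the snippet runs anywhere, and the idea of a
client-generated random machine ID is an assumption about how reports would
be tied together.

```python
import sqlite3

SCHEMA = """
CREATE TABLE machine (
    id TEXT PRIMARY KEY              -- random ID chosen by the client
);
CREATE TABLE report (
    id INTEGER PRIMARY KEY,
    machine_id TEXT NOT NULL REFERENCES machine(id),
    received TEXT NOT NULL           -- ISO 8601 date of submission
);
CREATE TABLE pci_device (
    report_id INTEGER NOT NULL REFERENCES report(id),
    vendor INTEGER NOT NULL,
    device INTEGER NOT NULL,
    class_code INTEGER NOT NULL
);
"""


def count_machines_with(db, vendor, device):
    """Count distinct machines that ever reported the given PCI vendor/device."""
    row = db.execute(
        "SELECT COUNT(DISTINCT r.machine_id) FROM pci_device p "
        "JOIN report r ON r.id = p.report_id "
        "WHERE p.vendor = ? AND p.device = ?", (vendor, device)).fetchone()
    return row[0]


# Load one made-up report and query it.
db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO machine VALUES ('a1')")
db.execute("INSERT INTO report VALUES (1, 'a1', '2017-11-08')")
db.execute("INSERT INTO pci_device VALUES (1, ?, ?, ?)", (0x8086, 0x1916, 0x030000))
print(count_machines_with(db, 0x8086, 0x1916))
```

Even a toy schema like this forces the questions that matter for production:
what identifies a machine, whether reports are kept as history or overwritten,
and which columns the common queries need indexed.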


-- 
Jeremy Cline
XMPP: jer...@jcline.org
IRC:  jcline



Re: Reviving the hardware census

2017-11-08 Thread Josh Boyer
On Wed, Nov 8, 2017 at 2:14 PM, Don Zickus  wrote:
> On Wed, Nov 08, 2017 at 01:48:36PM -0500, Josh Boyer wrote:
>> >> [1] https://github.com/npmccallum/census
>> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> >> [3] https://github.com/npmccallum/census/pull/3
>> >
>> > Internally, we have been focusing on using 'lshw' as the tool that provides
>> > all that info and handles all the arch funkiness (and includes firmware).
>> > If there is anything missing, we have tried to push upstream to that
>> > project.
>> >
>> > Would that cover a lot of the info you are looking for?
>>
>> It sounds like lshw could provide the output for the local system if
>> someone wrote a census plugin for it.  What it doesn't seem to cover
>> at all is the "gather data and send it somewhere" part, right?
>
> I think it covers part of the 'gather data', no? :-)  I had assumed the
> census tool handles the 'send it' somewhere.

Sorry, I phrased that awkwardly.  I meant "gather the data from
multiple computers and send it to a central location".  But I think
we're saying the same thing.

> As part of the kernel CI work I am doing internally, we are trying to figure
> out a more universal way of exchanging machine info when providing feedback
> that a test or patch broke.  Lots of folks have been using lshw.  This has
> made it easier to write scripts on top of that output compared to various
> custom output.  It isn't perfect, but it seems to do a reasonable job today.

Right.  Adding a census plugin to consume that could build on top of
it even further.

josh


Re: Reviving the hardware census

2017-11-08 Thread Don Zickus
On Wed, Nov 08, 2017 at 01:48:36PM -0500, Josh Boyer wrote:
> >> [1] https://github.com/npmccallum/census
> >> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> >> [3] https://github.com/npmccallum/census/pull/3
> >
> > Internally, we have been focusing on using 'lshw' as the tool that provides
> > all that info and handles all the arch funkiness (and includes firmware).
> > If there is anything missing, we have tried to push upstream to that
> > project.
> >
> > Would that cover a lot of the info you are looking for?
> 
> It sounds like lshw could provide the output for the local system if
> someone wrote a census plugin for it.  What it doesn't seem to cover
> at all is the "gather data and send it somewhere" part, right?

I think it covers part of the 'gather data', no? :-)  I had assumed the
census tool handles the 'send it' somewhere.

As part of the kernel CI work I am doing internally, we are trying to figure
out a more universal way of exchanging machine info when providing feedback
that a test or patch broke.  Lots of folks have been using lshw.  This has
made it easier to write scripts on top of that output compared to various
custom output.  It isn't perfect, but it seems to do a reasonable job today.


Cheers,
Don

> 
> josh


Re: Reviving the hardware census

2017-11-08 Thread Josh Boyer
On Wed, Nov 8, 2017 at 12:34 PM, Don Zickus  wrote:
> On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
>> Hey folks,
>>
>> For some time now, Fedora has operated without a database of hardware
>> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> its intended successor[1] was never deployed by Fedora Infrastructure.
>>
>> It would be nice to have a hardware database, so I (and hopefully some
>> others) would like to get Census up and running for Fedora. Before we
>> look at deploying Census, however, it would be good to make sure it has
>> everything we need.
>>
>> Census has client plugins to collect information[2]. At the moment, it
>> has plugins for:
>>
>> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>>   each PCI device
>>
>> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>>   as well as the bInterfaceClass, bInterfaceSubClass, and
>>   bInterfaceProtocol for each interface
>>
>> * The contents of /etc/os-release
>>
>> * All the RPMs installed on a system
>>
>> Other than the drivers bound to the PCI and USB devices (which is an
>> open PR[3]), what else would be good to collect?
>>
>> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> [1] https://github.com/npmccallum/census
>> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> [3] https://github.com/npmccallum/census/pull/3
>
> Internally, we have been focusing on using 'lshw' as the tool that provides
> all that info and handles all the arch funkiness (and includes firmware).
> If there is anything missing, we have tried to push upstream to that
> project.
>
> Would that cover a lot of the info you are looking for?

It sounds like lshw could provide the output for the local system if
someone wrote a census plugin for it.  What it doesn't seem to cover
at all is the "gather data and send it somewhere" part, right?

josh


Re: Reviving the hardware census

2017-11-08 Thread Don Zickus
On Tue, Nov 07, 2017 at 10:49:02PM +, Jeremy Cline wrote:
> Hey folks,
> 
> For some time now, Fedora has operated without a database of hardware
> users have. Smolt, the old hardware database, was retired in 2012[0] and
> its intended successor[1] was never deployed by Fedora Infrastructure.
> 
> It would be nice to have a hardware database, so I (and hopefully some
> others) would like to get Census up and running for Fedora. Before we
> look at deploying Census, however, it would be good to make sure it has
> everything we need.
> 
> Census has client plugins to collect information[2]. At the moment, it
> has plugins for:
> 
> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>   each PCI device
> 
> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>   as well as the bInterfaceClass, bInterfaceSubClass, and
>   bInterfaceProtocol for each interface
> 
> * The contents of /etc/os-release
> 
> * All the RPMs installed on a system
> 
> Other than the drivers bound to the PCI and USB devices (which is an
> open PR[3]), what else would be good to collect?
> 
> [0] https://fedoraproject.org/wiki/Smolt_retirement
> [1] https://github.com/npmccallum/census
> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> [3] https://github.com/npmccallum/census/pull/3

Internally, we have been focusing on using 'lshw' as the tool that provides
all that info and handles all the arch funkiness (and includes firmware).
If there is anything missing, we have tried to push upstream to that
project.

Would that cover a lot of the info you are looking for?

Cheers,
Don


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
I forgot to post the link to the server-side of the pci plugin:

https://github.com/npmccallum/census/blob/master/libs/census/server/plugins/hardware/pci.py

On Wed, Nov 8, 2017 at 9:24 AM, Nathaniel McCallum
 wrote:
> Here is why I don't think we need to have all the data collection
> requirements up front. Census is designed to be very modular. A data
> collector plugin is just an executable that outputs a JSON blob. A
> corresponding server-side plugin parses this data and stores it in the
> database in an efficient way to be later queried.
>
> My goal is to stand up the basic service with some initial limited
> data. Once we prove that this works, domain experts can write the
> plugins to gather the data they want to collect. Here's an example to
> show how easy it would be (this is the PCI device plugin):
>
> https://github.com/npmccallum/census/blob/master/client/plugins/hardware.pci
>
> A really great example is the one Peter Robinson just gave. I know
> nothing about "SoC attached" devices. Nor should I. I can just let
> Peter write the client and server plugins and help review them for
> semantic correctness. He can also craft the queries he wants. But
> since he's the domain expert, he gets to make most of the important
> decisions.
>
> If census does its job right, it totally offloads data collection and
> analysis to the people who know what they are doing with that data.
>
>
> On Tue, Nov 7, 2017 at 5:49 PM, Jeremy Cline  wrote:
>> Hey folks,
>>
>> For some time now, Fedora has operated without a database of hardware
>> users have. Smolt, the old hardware database, was retired in 2012[0] and
>> its intended successor[1] was never deployed by Fedora Infrastructure.
>>
>> It would be nice to have a hardware database, so I (and hopefully some
>> others) would like to get Census up and running for Fedora. Before we
>> look at deploying Census, however, it would be good to make sure it has
>> everything we need.
>>
>> Census has client plugins to collect information[2]. At the moment, it
>> has plugins for:
>>
>> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>>   each PCI device
>>
>> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>>   as well as the bInterfaceClass, bInterfaceSubClass, and
>>   bInterfaceProtocol for each interface
>>
>> * The contents of /etc/os-release
>>
>> * All the RPMs installed on a system
>>
>> Other than the drivers bound to the PCI and USB devices (which is an
>> open PR[3]), what else would be good to collect?
>>
>> [0] https://fedoraproject.org/wiki/Smolt_retirement
>> [1] https://github.com/npmccallum/census
>> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
>> [3] https://github.com/npmccallum/census/pull/3
>>
>>
>> --
>> Jeremy Cline
>> XMPP: jer...@jcline.org
>> IRC:  jcline
>>


Re: Reviving the hardware census

2017-11-08 Thread Nathaniel McCallum
Here is why I don't think we need to have all the data collection
requirements up front. Census is designed to be very modular. A data
collector plugin is just an executable that outputs a JSON blob. A
corresponding server-side plugin parses this data and stores it in the
database in an efficient way to be later queried.

My goal is to stand up the basic service with some initial limited
data. Once we prove that this works, domain experts can write the
plugins to gather the data they want to collect. Here's an example to
show how easy it would be (this is the PCI device plugin):

https://github.com/npmccallum/census/blob/master/client/plugins/hardware.pci
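
For readers who do not want to follow the link, the general shape of such a
collector can be sketched as below. This is an independent illustration, not
the code at that URL: the function name, the `root` parameter, and the choice
to emit integers are assumptions. It reads the five PCI attributes named in
the original post from sysfs and prints them as JSON.

```python
#!/usr/bin/env python3
"""Hypothetical census-style collector: print PCI device IDs as JSON."""
import json
import os
import sys

# The five sysfs attributes named in the thread for each PCI device.
PCI_ATTRS = ("vendor", "device", "subsystem_vendor", "subsystem_device", "class")


def collect_pci(root="/sys/bus/pci/devices"):
    """Return one dict of numeric IDs per PCI device found under root."""
    devices = []
    if not os.path.isdir(root):
        return devices
    for slot in sorted(os.listdir(root)):
        entry = {}
        for attr in PCI_ATTRS:
            try:
                with open(os.path.join(root, slot, attr)) as f:
                    # sysfs exposes values like "0x8086\n"; store them as integers.
                    entry[attr] = int(f.read().strip(), 16)
            except (OSError, ValueError):
                entry[attr] = None  # attribute missing or unreadable
        devices.append(entry)
    return devices


if __name__ == "__main__":
    json.dump(collect_pci(), sys.stdout)
```

Because the only contract is "an executable that writes JSON to stdout", a
plugin like this can be tested by pointing `root` at a directory of fixture
files instead of the live sysfs tree.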

A really great example is the one Peter Robinson just gave. I know
nothing about "SoC attached" devices. Nor should I. I can just let
Peter write the client and server plugins and help review them for
semantic correctness. He can also craft the queries he wants. But
since he's the domain expert, he gets to make most of the important
decisions.

If census does its job right, it totally offloads data collection and
analysis to the people who know what they are doing with that data.


On Tue, Nov 7, 2017 at 5:49 PM, Jeremy Cline  wrote:
> Hey folks,
>
> For some time now, Fedora has operated without a database of hardware
> users have. Smolt, the old hardware database, was retired in 2012[0] and
> its intended successor[1] was never deployed by Fedora Infrastructure.
>
> It would be nice to have a hardware database, so I (and hopefully some
> others) would like to get Census up and running for Fedora. Before we
> look at deploying Census, however, it would be good to make sure it has
> everything we need.
>
> Census has client plugins to collect information[2]. At the moment, it
> has plugins for:
>
> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>   each PCI device
>
> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>   as well as the bInterfaceClass, bInterfaceSubClass, and
>   bInterfaceProtocol for each interface
>
> * The contents of /etc/os-release
>
> * All the RPMs installed on a system
>
> Other than the drivers bound to the PCI and USB devices (which is an
> open PR[3]), what else would be good to collect?
>
> [0] https://fedoraproject.org/wiki/Smolt_retirement
> [1] https://github.com/npmccallum/census
> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> [3] https://github.com/npmccallum/census/pull/3
>
>
> --
> Jeremy Cline
> XMPP: jer...@jcline.org
> IRC:  jcline
>


Re: Reviving the hardware census

2017-11-08 Thread Peter Robinson
On Tue, Nov 7, 2017 at 10:49 PM, Jeremy Cline  wrote:
> Hey folks,
>
> For some time now, Fedora has operated without a database of hardware
> users have. Smolt, the old hardware database, was retired in 2012[0] and
> its intended successor[1] was never deployed by Fedora Infrastructure.
>
> It would be nice to have a hardware database, so I (and hopefully some
> others) would like to get Census up and running for Fedora. Before we
> look at deploying Census, however, it would be good to make sure it has
> everything we need.
>
> Census has client plugins to collect information[2]. At the moment, it
> has plugins for:
>
> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>   each PCI device
>
> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>   as well as the bInterfaceClass, bInterfaceSubClass, and
>   bInterfaceProtocol for each interface
>
> * The contents of /etc/os-release
>
> * All the RPMs installed on a system
>
> Other than the drivers bound to the PCI and USB devices (which is an
> open PR[3]), what else would be good to collect?

On ARM, any platform ("SoC attached") devices would be useful; otherwise
you'll get almost nothing, as most NICs/SATA/storage/GPU/cameras etc. are
generally not attached via USB/PCI. This would also include things like
SDIO (for wifi), and I suppose i2c/gpio would be useful in that context
too.
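
One possible way to pick these up, sketched here under assumptions rather
than taken from any existing census plugin: walk the non-PCI/USB buses in
sysfs and record each device's modalias string. The bus list, the function
names, and the use of modalias as the identifier are all hypothetical
choices; devices without a modalias file (I2C adapters, for example) are
recorded as None.

```python
import json
import os

# Buses mentioned in the thread that PCI/USB enumeration misses.
BUSES = ("platform", "sdio", "i2c")


def collect_bus(bus, sysfs="/sys"):
    """Map each device name on a bus to its modalias string (None if absent)."""
    root = os.path.join(sysfs, "bus", bus, "devices")
    found = {}
    if not os.path.isdir(root):
        return found
    for name in sorted(os.listdir(root)):
        try:
            with open(os.path.join(root, name, "modalias")) as f:
                found[name] = f.read().strip() or None
        except OSError:
            found[name] = None  # no modalias exposed for this entry
    return found


def collect_soc(sysfs="/sys"):
    """Collect every bus in BUSES into one dict keyed by bus name."""
    return {bus: collect_bus(bus, sysfs) for bus in BUSES}


if __name__ == "__main__":
    print(json.dumps(collect_soc()))
```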

Peter


Re: Reviving the hardware census

2017-11-08 Thread Bastien Nocera


----- Original Message -----
> Hey folks,
> 
> For some time now, Fedora has operated without a database of hardware
> users have. Smolt, the old hardware database, was retired in 2012[0] and
> its intended successor[1] was never deployed by Fedora Infrastructure.
> 
> It would be nice to have a hardware database, so I (and hopefully some
> others) would like to get Census up and running for Fedora. Before we
> look at deploying Census, however, it would be good to make sure it has
> everything we need.
> 
> Census has client plugins to collect information[2]. At the moment, it
> has plugins for:
> 
> * The vendor, device, subsystem_vendor, subsystem_device, and class from
>   each PCI device
> 
> * The idVendor, idProduct, bcdDevice, and bDeviceClass for USB devices
>   as well as the bInterfaceClass, bInterfaceSubClass, and
>   bInterfaceProtocol for each interface
> 
> * The contents of /etc/os-release
> 
> * All the RPMs installed on a system
> 
> Other than the drivers bound to the PCI and USB devices (which is an
> open PR[3]), what else would be good to collect?

I2C devices. PCI and USB would be pretty much everything you'd get on
a desktop or old-school Intel laptop, but for SoC tablets, convertibles
and low-powered laptops, this wouldn't cover much.

The ACPI DSDT would also be useful in figuring out what devices are
unsupported.

> [0] https://fedoraproject.org/wiki/Smolt_retirement
> [1] https://github.com/npmccallum/census
> [2] https://github.com/npmccallum/census/blob/master/client/plugins/
> [3] https://github.com/npmccallum/census/pull/3
> 
> 
> --
> Jeremy Cline
> XMPP: jer...@jcline.org
> IRC:  jcline
> 
> 