On Sat, 2017-05-13 at 19:52 +0200, Robert Varga wrote:
> On 11/05/17 21:18, Casey Cain wrote:
> > Hello, everyone!
> 
> Hello Casey,
> 
> > > Robert Varga stated: 
> > > "One thing I have concern about is the company break-down, where
> > > my commits are attributed to Cisco only, whereas they should be
> > > attributed based on the email address:
> > > - [email protected] <mailto:[email protected]> -> Cisco
> > > - *varga@pantheon* -> Pantheon Technologies
> > > - [email protected] <mailto:[email protected]> -> unaffiliated
> > 
> > Currently, we support "sequential affiliations" for these cases.
> > Since addresses are not usually reliable (eg, people start
> > committing for a company with their old personal address), we use
> > dates to determine periods of affiliation. If Robert can provide us
> > with the periods of affiliation for Cisco, Pantheon Technologies,
> > and unaffiliated, we can include that information for his profile.
> 
> I am sorry, that will not work, as the periods are overlapping. I
> think the classification rules are simple enough:
> 
> - check against known company emails, if not matched then
> - look up in sequential affiliations, if not matched then
> - attribute to 'Unknown'
> 
> This will provide accurate results for both cases as long as company
> email addresses can be trusted. If that assumption does not hold, I
> am afraid we have a larger issue (and a separate topic).

Hi, Robert,

The problem with that approach is that we track each person's activity
across all data sources, so the algorithm cannot rely on the email
address: in many cases there is no email address to look at. Instead,
we first work out which identities correspond to the same person, then
find the right affiliation for that person based on dates, and then
annotate the commit with it. The identities a person may have, and
their affiliations (which are assumed to be sequential), are
determined in a separate, earlier process.
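
As a rough sketch of that pipeline (not our actual code; the
identities, person ids, dates, company names and function names below
are all made up for illustration), assuming each person's affiliations
are stored as non-overlapping periods sorted by start date:

```python
from bisect import bisect_right
from datetime import date

# Hypothetical data, for illustration only.
# Earlier process, part 1: identities already merged into persons.
IDENTITY_TO_PERSON = {
    "dev@old-mail.example": "person-1",
    "dev@company-b.example": "person-1",
}
# Earlier process, part 2: each person's affiliations as
# (start date, organization), sorted by start date and assumed
# to be sequential (no overlaps).
ENROLLMENTS = {
    "person-1": [
        (date(2013, 1, 1), "Company A"),
        (date(2015, 6, 1), "Company B"),
    ],
}


def affiliation_at(person, when, default="Unknown"):
    """Return the organization whose period contains `when`."""
    periods = ENROLLMENTS.get(person, [])
    starts = [start for start, _ in periods]
    index = bisect_right(starts, when)  # number of periods started by `when`
    return periods[index - 1][1] if index else default


def annotate(commit):
    """Attach an 'org' field to a commit, based on person and date."""
    person = IDENTITY_TO_PERSON.get(commit["author"])
    return {**commit, "org": affiliation_at(person, commit["date"])}
```

Note that a commit made in 2016 from the old personal identity is
still attributed to "Company B" here, because only the date matters
once the identities have been merged. That is exactly why overlapping
affiliations do not fit this model.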

I see that in your case the assumption of sequential affiliations does
not hold. I'm going to look at how we can take your case into account.
Off the top of my head, one option would be to stop treating identities
that contribute simultaneously as the same person, and simply
affiliate each identity with a different company. I'm going to check
whether this would work.
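
To make the idea concrete, here is a minimal sketch of what I have in
mind. It is only an assumption about how this could look, not
something we have implemented: the table, the addresses and the
`affiliate` function are hypothetical. Identities listed in the table
keep a fixed affiliation regardless of date; anything else falls back
to the usual date-based lookup, and then to "Unknown":

```python
from datetime import date

# Hypothetical table: identities that are NOT merged into one person,
# each with its own fixed affiliation.
IDENTITY_ORG = {
    "dev@company-a.example": "Company A",
    "dev@company-b.example": "Company B",
    "dev@personal.example": "Individual",
}


def affiliate(identity, when, sequential_lookup=None):
    """Pick an affiliation for one contribution.

    `sequential_lookup` is an optional callable (identity, date) -> str
    or None, standing in for the existing date-based process.
    """
    # 1. A per-identity affiliation wins, whatever the date.
    if identity in IDENTITY_ORG:
        return IDENTITY_ORG[identity]
    # 2. Otherwise use the date-based sequential affiliations.
    if sequential_lookup is not None:
        org = sequential_lookup(identity, when)
        if org:
            return org
    # 3. Nothing matched.
    return "Unknown"
```

With this, two contributions made on the same day from
"dev@company-a.example" and "dev@company-b.example" would be
attributed to different companies, which the purely date-based
approach cannot do.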

Do you have an idea of how many cases like this there are in your
community?

> I also think we should have a separate 'Individual' category,
> distinct from 'Unknown'.

That's fine. We only need to know when to affiliate someone as
"Individual". We can provide you with a file format in which anyone who
should be listed as "Individual" can enter their identities.

> > > The second thing is that pantheon.sk and pantheon.tech addresses
> > > seem to be lumped into the 'Unknown' category -- which is very
> > > visible in the topoprocessing repository.
> > > What can I do to remedy these?"
> > 
> > Can we assume that we should assign pantheon.sk and pantheon.tech
> > to "Pantheon Technologies"?
> 
> Yes. Furthermore, I think these should be clarified for all member
> companies ASAP.

OK. If you want, we can provide you with a listing of identities so
that you can flag any affiliations that are wrong.

Saludos,

        Jesus.

> Regards,
> Robert
> 
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah

_______________________________________________
Discuss mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/discuss
