On 20 Nov 2007, at 20:28, Doug Cutting wrote:
karl wettin wrote:
On Nov 15, 2007 10:09 PM, Grant Ingersoll <[EMAIL PROTECTED]>
wrote:
it is always good to have query logs
http://thepiratebay.org/tor/3783572
It doesn't look as though there's click data, so we can't use this
for relevance exp
karl wettin wrote:
On Nov 15, 2007 10:09 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
it is always good to have query logs
I realize that it is not that politically correct, but the TPB
collection is released to the public domain and contains 3.2 million
user queries with session id, timesta
: I think the safest path is simply to not publish any queries, but rather to,
: e.g., permit committers to run experiments using them and publish the results
: of the experiments. But no queries would be made available to the general
: public on a website.
that would eliminate the goal of havin
This may be worth asking legal-discuss about. I am not sure if there
is an issue or not.
-Grant
On Nov 20, 2007, at 4:54 AM, karl wettin wrote:
On Nov 15, 2007 10:09 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
it is always good to have query logs
I realize that it is not that politic
On Nov 15, 2007 10:09 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> it is always good to have query logs
I realize that it is not that politically correct, but the TPB
collection is released to the public domain and contains 3.2 million
user queries with session id, timestamp, category etc to g
Chris Hostetter wrote:
right ... i'm not suggesting we do this in an automatic un-human-involved
way; i'm suggesting that a "trusted" person generate this report,
ignore anything with a count less than some number (both to remove noise,
and eliminate most of the random "identifiable" queries),
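A minimal sketch of the aggregated report described above, assuming the log has already been reduced to (query string, URL) pairs with IP and session fields dropped. The function name `aggregate_queries`, the `min_count` default, and the lower-casing of queries are illustrative assumptions, not anything proposed in the thread, and a count threshold does not by itself guarantee that no identifying query survives, so a trusted person would still need to review the output.

```python
from collections import Counter


def aggregate_queries(records, min_count=5):
    """Count (query, url) pairs and drop anything seen fewer than min_count times.

    records: iterable of (query_string, url) pairs. Any IP or session
    fields are assumed to have been stripped before this point.
    """
    # Normalising whitespace and case is an assumption made for this sketch.
    counts = Counter((query.strip().lower(), url) for query, url in records)
    # Keep only pairs at or above the threshold, removing noise and most
    # one-off queries that could identify a person.
    return {key: n for key, n in counts.items() if n >= min_count}
```

For example, with `min_count=5`, a query seen five times for the same URL is reported with its count, while a query seen only once is left out of the report entirely.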
On Nov 19, 2007, at 3:41 PM, Chris Hostetter wrote:
: info, etc. could be stripped fairly easily. So, we wouldn't necessarily know
: who is searching for "Yonik Seeley" when we see that query term, just that it
: was searched for. Maybe we can inquire to infrastructure what is even
: info, etc. could be stripped fairly easily. So, we wouldn't necessarily know
: who is searching for "Yonik Seeley" when we see that query term, just that it
: was searched for. Maybe we can inquire to infrastructure what is even
It's a largely theoretical argument (particularly relating to
I'm not sure where the personal info is leaked, we aren't proposing to
make who made the query available, just what the query is and I
suspect the IP info, etc. could be stripped fairly easily. So, we
wouldn't necessarily know who is searching for "Yonik Seeley" when we
see that query ter
: > report of (querystring,accesscount)->url mappings based on requests that
: > had a major search engine as the refer URL, that should be fine right?
:
: Query strings can leak personal info too (think of someone googling
: themselves or their SSN)
right ... i'm not suggesting we do this in an
On Nov 19, 2007 1:29 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : Note that logs are generally considered private data. So we could not make
> : these available to the general public, but only to folks who've somehow sworn
> : to keep them private.
>
> but in theory, it would be okay to m
: Note that logs are generally considered private data. So we could not make
: these available to the general public, but only to folks who've somehow sworn
: to keep them private.
but in theory, it would be okay to make aggregated info from the logs
available right? ie: we don't want to make
Note that logs are generally considered private data. So we could not
make these available to the general public, but only to folks who've
somehow sworn to keep them private.
Doug
Not so sure about relevance, but it is always good to have query logs
and we have the data, so we could start building up relevance
judgments over time based on the data. Might be good for demos and
other stuff too.
-Grant
On Nov 15, 2007, at 3:20 PM, Mike Klaas wrote:
On 15-Nov-07, at
On 15-Nov-07, at 5:33 AM, Grant Ingersoll wrote:
Would people be interested in asking infrastructure to see if we
can get our hands on things like JIRA search logs and any other
search/query logs available? I'm thinking if we had this, plus the
underlying data, we could start to use this i
Would people be interested in asking infrastructure to see if we can
get our hands on things like JIRA search logs and any other search/
query logs available? I'm thinking if we had this, plus the
underlying data, we could start to use this in a number of places like
benchmark, for testing