[ 
https://issues.apache.org/jira/browse/SOLR-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475025#comment-16475025
 ] 

Hoss Man commented on SOLR-9480:
--------------------------------

Updated patch...

This includes cleanup of some test & javadoc nocommits, but the biggest change 
is renaming {{skg(...)}} to {{relatedness(...)}} -- that's the best name I 
could come up with.

It occurred to me I never really posted a full example of what generating an SKG 
looks like with this approach of implementing relatedness as an Aggregate 
function, so here's a complete request/response example using stackexchange 
"scifi" data...

{noformat}
curl -sS -X POST http://localhost:8983/solr/scifi/query -d \
'rows=0&q=type:QUESTION&fore=body:%22harry+potter%22&back=*:*&json.facet={
  tags : {
    type : terms,
    field : tags,
    limit : 5,
    sort : { skg: desc },
    facet : {
      skg : "relatedness($fore,$back)",
      body : {
        type : terms,
        field : body,
        limit : 5,
        sort : { skg: desc },
        facet : {
          skg : "relatedness($fore,$back)"
        }
      }
    }
  }
}'
{noformat}
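
As a rough sketch, the same request could also be issued from a script instead of curl. The helper names below ({{relatedness_facet}}, {{build_params}}, {{run_query}}) are hypothetical and not part of the patch; the URL, parameters and facet structure are simply copied from the curl example above, and the facet is sent as strict JSON rather than the relaxed syntax used there.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint, copied from the curl example; adjust for your own setup.
SOLR_URL = "http://localhost:8983/solr/scifi/query"


def relatedness_facet(field, limit=5, child=None):
    """Build one terms facet sorted by relatedness (hypothetical helper)."""
    sub = {"skg": "relatedness($fore,$back)"}
    if child is not None:
        sub.update(child)  # nest further facets under this one
    return {
        "type": "terms",
        "field": field,
        "limit": limit,
        "sort": {"skg": "desc"},
        "facet": sub,
    }


def build_params():
    """Return the same request parameters as the curl example."""
    facet = {
        "tags": relatedness_facet("tags", child={"body": relatedness_facet("body")})
    }
    return {
        "rows": "0",
        "q": "type:QUESTION",
        "fore": 'body:"harry potter"',  # foreground query
        "back": "*:*",                  # background query
        "json.facet": json.dumps(facet),
    }


def run_query(url=SOLR_URL):
    """POST the parameters as form data and return the decoded JSON response."""
    data = urllib.parse.urlencode(build_params()).encode("utf-8")
    with urllib.request.urlopen(url, data=data) as resp:
        return json.load(resp)
```

Calling {{run_query()}} needs a running Solr with the "scifi" collection; {{build_params()}} can be inspected on its own without one.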


{noformat}
{
  "responseHeader":{
    "status":0,
    "QTime":4402,
    "params":{
      "q":"type:QUESTION",
      "json.facet":"{\n  tags : {\n    type : terms,\n    field : tags,\n    limit : 5,\n    sort : { skg: desc },\n    facet : {\n      skg : \"relatedness($fore,$back)\",\n      body : {\n        type : terms,\n  field : body,\n        limit : 5,\n    sort : { skg: desc },\n    facet : {\n      skg : \"relatedness($fore,$back)\"\n    }\n      }\n    }\n  }\n}",
      "back":"*:*",
      "rows":"0",
      "fore":"body:\"harry potter\""}},
  "response":{"numFound":46598,"start":0,"docs":[]
  },
  "facets":{
    "count":46598,
    "tags":{
      "buckets":[{
          "val":"harry-potter",
          "count":5141,
          "skg":{
            "relatedness":0.70795,
            "foreground_popularity":0.01113,
            "background_popularity":0.03627},
          "body":{
            "buckets":[{
                "val":"potter",
                "count":1715,
                "skg":{
                  "relatedness":0.83699,
                  "foreground_popularity":0.01113,
                  "background_popularity":0.03555}},
              {
                "val":"harry",
                "count":2944,
                "skg":{
                  "relatedness":0.76488,
                  "foreground_popularity":0.01113,
                  "background_popularity":0.07392}},
              {
                "val":"deathly",
                "count":516,
                "skg":{
                  "relatedness":0.41314,
                  "foreground_popularity":0.0017,
                  "background_popularity":0.01308}},
              {
                "val":"hallows",
                "count":525,
                "skg":{
                  "relatedness":0.4125,
                  "foreground_popularity":0.00171,
                  "background_popularity":0.01333}},
              {
                "val":"hogwarts",
                "count":1061,
                "skg":{
                  "relatedness":0.39054,
                  "foreground_popularity":0.00229,
                  "background_popularity":0.02585}}]}},
        {
          "val":"jk-rowling",
          "count":107,
          "skg":{
            "relatedness":0.23501,
            "foreground_popularity":3.7E-4,
            "background_popularity":7.5E-4},
          "body":{
            "buckets":[{
                "val":"attender",
                "count":1,
                "skg":{
                  "relatedness":0.4322,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"escapers",
                "count":1,
                "skg":{
                  "relatedness":0.4322,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"l'etat",
                "count":1,
                "skg":{
                  "relatedness":0.4322,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"mugglenet's",
                "count":1,
                "skg":{
                  "relatedness":0.4322,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"pocketeded",
                "count":1,
                "skg":{
                  "relatedness":0.4322,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}}]}},
        {
          "val":"the-cursed-child",
          "count":60,
          "skg":{
            "relatedness":0.23294,
            "foreground_popularity":2.7E-4,
            "background_popularity":4.2E-4},
          "body":{
            "buckets":[{
                "val":"cursed",
                "count":45,
                "skg":{
                  "relatedness":0.6238,
                  "foreground_popularity":2.6E-4,
                  "background_popularity":0.00459}},
              {
                "val":"delphi",
                "count":10,
                "skg":{
                  "relatedness":0.50766,
                  "foreground_popularity":5.0E-5,
                  "background_popularity":2.9E-4}},
              {
                "val":"scorpius",
                "count":14,
                "skg":{
                  "relatedness":0.48154,
                  "foreground_popularity":7.0E-5,
                  "background_popularity":6.9E-4}},
              {
                "val":"neutralising",
                "count":1,
                "skg":{
                  "relatedness":0.479,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"noselessness",
                "count":1,
                "skg":{
                  "relatedness":0.479,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}}]}},
        {
          "val":"voldemort",
          "count":460,
          "skg":{
            "relatedness":0.21765,
            "foreground_popularity":7.6E-4,
            "background_popularity":0.00324},
          "body":{
            "buckets":[{
                "val":"potter",
                "count":118,
                "skg":{
                  "relatedness":0.44277,
                  "foreground_popularity":7.6E-4,
                  "background_popularity":0.03555}},
              {
                "val":"voldemort",
                "count":384,
                "skg":{
                  "relatedness":0.42619,
                  "foreground_popularity":6.7E-4,
                  "background_popularity":0.03074}},
              {
                "val":"harry",
                "count":278,
                "skg":{
                  "relatedness":0.33236,
                  "foreground_popularity":7.6E-4,
                  "background_popularity":0.07392}},
              {
                "val":"948",
                "count":1,
                "skg":{
                  "relatedness":0.32771,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"chernyshov",
                "count":1,
                "skg":{
                  "relatedness":0.32771,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}}]}},
        {
          "val":"spells",
          "count":175,
          "skg":{
            "relatedness":0.19104,
            "foreground_popularity":4.0E-4,
            "background_popularity":0.00123},
          "body":{
            "buckets":[{
                "val":"bitingly",
                "count":1,
                "skg":{
                  "relatedness":0.42157,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"centrari",
                "count":1,
                "skg":{
                  "relatedness":0.42157,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"counterspelling",
                "count":1,
                "skg":{
                  "relatedness":0.42157,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"effectivly",
                "count":1,
                "skg":{
                  "relatedness":0.42157,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}},
              {
                "val":"expelliarus",
                "count":1,
                "skg":{
                  "relatedness":0.42157,
                  "foreground_popularity":1.0E-5,
                  "background_popularity":1.0E-5}}]}}]}}}
{noformat}
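
To read the nested buckets as graph edges (tag -> body term), a response like the one above could be flattened with something along these lines. This is only an illustrative sketch: {{iter_edges}} and its {{min_count}} argument are hypothetical names, and the sample data is a trimmed copy of values from the response above.

```python
def iter_edges(response, outer="tags", inner="body", stat="skg", min_count=1):
    """Yield (tag, tag_relatedness, term, term_relatedness) tuples.

    `outer`, `inner` and `stat` are the facet names used in the request;
    `min_count` optionally drops nested terms seen in fewer documents.
    """
    for tag_bucket in response["facets"].get(outer, {}).get("buckets", []):
        tag_score = tag_bucket[stat]["relatedness"]
        for term_bucket in tag_bucket.get(inner, {}).get("buckets", []):
            if term_bucket["count"] < min_count:
                continue
            yield (
                tag_bucket["val"],
                tag_score,
                term_bucket["val"],
                term_bucket[stat]["relatedness"],
            )


# Trimmed sample built from values in the response above.
sample = {
    "facets": {
        "count": 46598,
        "tags": {"buckets": [
            {"val": "harry-potter", "count": 5141,
             "skg": {"relatedness": 0.70795},
             "body": {"buckets": [
                 {"val": "potter", "count": 1715, "skg": {"relatedness": 0.83699}},
                 {"val": "harry", "count": 2944, "skg": {"relatedness": 0.76488}},
             ]}},
            {"val": "jk-rowling", "count": 107,
             "skg": {"relatedness": 0.23501},
             "body": {"buckets": [
                 {"val": "attender", "count": 1, "skg": {"relatedness": 0.4322}},
             ]}},
        ]},
    }
}

for tag, tag_score, term, term_score in iter_edges(sample):
    print(f"{tag} ({tag_score}) -> {term} ({term_score})")
```

Passing a larger {{min_count}} (for example 2) would skip the nested terms that appear in only a single document, such as those under "jk-rowling" and "spells" in the response above.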



Now that the randomized tests seem really reliable, I'll work on refactoring 
the Slot collection vs distributed Merging to reduce code duplication ... but 
in general I think this is getting really close to being committable.


> Graph Traversal for Significantly Related Terms (Semantic Knowledge Graph)
> --------------------------------------------------------------------------
>
>                 Key: SOLR-9480
>                 URL: https://issues.apache.org/jira/browse/SOLR-9480
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Trey Grainger
>            Priority: Major
>         Attachments: SOLR-9480.patch, SOLR-9480.patch, SOLR-9480.patch, 
> SOLR-9480.patch
>
>
> This issue is to track the contribution of the Semantic Knowledge Graph Solr 
> Plugin (request handler), which exposes a graph-like interface for 
> discovering and traversing significant relationships between entities within 
> an inverted index.
> This data model has been described in the following research paper: [The 
> Semantic Knowledge Graph: A compact, auto-generated model for real-time 
> traversal and ranking of any relationship within a 
> domain|https://arxiv.org/abs/1609.00464], as well as in presentations I gave 
> in October 2015 at [Lucene/Solr 
> Revolution|http://www.slideshare.net/treygrainger/leveraging-lucenesolr-as-a-knowledge-graph-and-intent-engine]
>  and November 2015 at the [Bay Area Search 
> Meetup|http://www.treygrainger.com/posts/presentations/searching-on-intent-knowledge-graphs-personalization-and-contextual-disambiguation/].
> The source code for this project is currently available at 
> [https://github.com/careerbuilder/semantic-knowledge-graph], and the folks at 
> CareerBuilder (where this was built) have given me the go-ahead to now 
> contribute this back to the Apache Solr Project, as well.
> Check out the GitHub repository, research paper, or presentations for a more 
> detailed description of this contribution. Initial patch coming soon.


