[ 
https://issues.apache.org/jira/browse/ANY23-336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416142#comment-16416142
 ] 

Lewis John McGibbney commented on ANY23-336:
--------------------------------------------

The embedded JSON-LD looks as follows:
{code}
[
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Zumba",
    "description": "<p> Fitness on the Green: Zumba Sunday 
10:30 - 11:30 AM TOTAL BLAST ZUMBA with Ariane Betancourt returns to 
Guthrie Green! When you see this outdoors Zumba\\u00ae class in action, 
you won\\'t wait to give it a try. As soon as you start this Latin-inspired, 
easy-to-follow, dance-fitness party, you will feel your body surge with energy. 
You will get </p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_zumba_class.jpg",
    "url": "/event/fitness-on-the-green-zumba/",
    "startDate": "2018-04-01T15:30:00+00:00",
    "endDate": "2018-04-01T16:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Yoga",
    "description": "<p> Fitness on the Green: YogaMonday 5:30 
\\u2013 6:30 PM & Wednesday 5:30 \\u2013 
6:30 PMJoin the YMCA of Greater Tulsa and Fowler Toyota for this relaxing 
addition to Fitness on the Green. Every Monday and Wednesday this season from 
5:30 PM to 6:30 PM at Guthrie Green. Bring your own mat and as always, class is 
FREE! Physical posture (Asana) and conscious </p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_Yoga.jpg",
    "url": "/event/fitness-on-the-green-yoga/",
    "startDate": "2018-04-02T22:30:00+00:00",
    "endDate": "2018-04-02T23:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Defend Together",
    "description": "<p> Fitness on the Green: 
Defend TogetherMonday 6:30 \\u2013 7:30 PMJoin the YMCA of Greater Tulsa 
and Fowler Toyota\\'s DEFEND TOGETHER fitness class. Every Monday night 
following Yoga from 6:30 to 7:30 PM at Guthrie Green. As always, class is 
FREE! DEFEND TOGETHER is an hour of exciting interval training that burns 
calories and builds total body strength. </p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_bootcamp2.jpg",
    "url": "/event/fitness-on-the-green-defend-together/",
    "startDate": "2018-04-02T23:30:00+00:00",
    "endDate": "2018-04-03T00:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Bootcamp",
    "description": "<p>Description: Fitness on the Green: 
BootcampTuesday 5:30 \\u2013 6:30 PM & Thursday 5:30 \\u2013 6:30 PM This 
one hour signature class is acomplete total body workout to help you get in 
shape or challenge your body toreach its fullest potential! This outdoor boot 
camp style class is a highintensity mix of cardio endurance, plyometric and 
agility </p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_bootcamp3.jpg",
    "url": "/event/fitness-on-the-green-bootcamp/",
    "startDate": "2018-04-03T22:30:00+00:00",
    "endDate": "2018-04-03T23:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Partner Power",
    "description": "<p> Fitness on the Green: Partner PowerWednesday 
6:00 - 7:00 AM Challenge yourself with this athletic-based 
cardio/strength-training circuit workout set up in stations. Time flies as you 
make your way around the circuit and walk (or crawl) away with a complete 
full-body workout. The fun is experienced through a buddy workout, bud don\\'t 
worry if you </p>\n",
    "image": 
"/wp-content/uploads/2018/03/gcal_event_Couple-Partner-wod-workout-iStock-813910678.jpg",
    "url": "/event/fitness-on-the-green-partner-power/",
    "startDate": "2018-04-04T11:00:00+00:00",
    "endDate": "2018-04-04T12:00:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Food Truck Wednesday",
    "description": "<p>Chow down at Guthrie Green every Wednesday for 
lunch! Live music, amazing food trucks, and fresh air \\u2013 what could be 
better? Join us every Wednesday between 11:30 am and 1:30 pm for a break from 
the office or a fun outing with the family!Every LAST Wednesday of the month is 
WILD CARD WEDNESDAY. Come </p>\n",
    "image": 
"/wp-content/uploads/2017/09/gcal_event_FoodTruckWed_GG_ShaneBrown_061114_LARGE-1.jpg",
    "url": "/event/food-truck-wednesday-94/",
    "startDate": "2018-04-04T16:30:00+00:00",
    "endDate": "2018-04-04T18:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Yoga",
    "description": "<p> Fitness on the Green: YogaMonday 5:30 
\\u2013 6:30 PM & Wednesday 5:30 \\u2013 
6:30 PMJoin the YMCA of Greater Tulsa and Fowler Toyota for this relaxing 
addition to Fitness on the Green. Every Monday and Wednesday this season from 
5:30 PM to 6:30 PM at Guthrie Green. Bring your own mat and as always, class is 
FREE! Physical posture (Asana) and conscious </p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_Yoga.jpg",
    "url": "/event/fitness-on-the-green-yoga-2/",
    "startDate": "2018-04-04T22:30:00+00:00",
    "endDate": "2018-04-04T23:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Fitness on the Green: Bootcamp",
    "description": "<p>Description: Fitness on the Green: 
BootcampTuesday 5:30 \\u2013 6:30 PM & Thursday 5:30 \\u2013 6:30 PM This 
one hour signature class is acomplete total body workout to help you get in 
shape or challenge your body toreach its fullest potential! This outdoor boot 
camp style class is a highintensity mix of cardio endurance, plyometric and 
agility </p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_bootcamp3.jpg",
    "url": "/event/fitness-on-the-green-bootcamp-2/",
    "startDate": "2018-04-05T22:30:00+00:00",
    "endDate": "2018-04-05T23:30:00+00:00"
  },
  {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Hip Hop 918: Kickin’ It Oldschool",
    "description": "<p>Guthrie Green is excited to announce the first 
ever Hip Hop 918 event, happening Saturday, April 7, 2018 from 7-10 
pm! \\u201cHip Hop 918: Kickin\\u2019 It Old School\\u201d is a Guthrie 
Green production, featuring a top-tier lineup that includes Doug E Fresh, Biz 
Markie and Big Daddy Kane. In addition the event will showcase local musical 
</p>\n",
    "image": "/wp-content/uploads/2018/03/gcal_event_GG-Event-Page.jpg",
    "url": "/event/hip-hop-918-kickin-it-oldschool/",
    "startDate": "2018-04-08T00:00:00+00:00",
    "endDate": "2018-04-08T03:00:00+00:00"
  }
]
{code}
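
Note that all nine objects above declare the same remote context, http://schema.org. If each declaration triggers its own HTTP fetch, as the zero cache hits reported in the issue description suggest, this payload alone would cost nine round trips. Below is a minimal sketch of per-URL memoization. It assumes a hypothetical ContextCacheSketch class and a stand-in fetcher function; neither is part of jsonld-java or Any23, and this is only an illustration of the idea, not a proposed patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: memoizes remote JSON-LD context documents by URL. */
public class ContextCacheSketch {

    // Maps a context URL to the document body that was fetched for it.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Assumed stand-in for whatever actually performs the HTTP request.
    private final Function<String, String> fetcher;

    public ContextCacheSketch(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the context for the given URL, invoking the fetcher at most once per URL. */
    public String load(String contextUrl) {
        return cache.computeIfAbsent(contextUrl, fetcher);
    }

    public static void main(String[] args) {
        AtomicInteger fetches = new AtomicInteger();

        // The fake fetcher only counts invocations; no network access happens here.
        ContextCacheSketch loader = new ContextCacheSketch(url -> {
            fetches.incrementAndGet();
            return "{\"@context\": {}}";
        });

        // Nine events, each naming the same context, as in the payload above.
        for (int i = 0; i < 9; i++) {
            loader.load("http://schema.org");
        }

        System.out.println("lookups=9 fetches=" + fetches.get());
    }
}
```

With the fake fetcher, nine lookups of the same URL result in a single fetch. A real fix would also have to respect HTTP cache headers and expiry, which this sketch ignores.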

> Parsing json-ld content takes prohibitively long time
> -----------------------------------------------------
>
>                 Key: ANY23-336
>                 URL: https://issues.apache.org/jira/browse/ANY23-336
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: core, extractors
>    Affects Versions: 2.2
>            Reporter: Hans Brende
>            Priority: Critical
>             Fix For: 2.3
>
>
> Using the page [https://www.guthriegreen.com|https://www.guthriegreen.com/] 
> as a benchmark, a page fetch took about 100 ms, while simply *parsing* the 
> json-ld content on that page took a *staggering 27400 ms*. For reference, I'm 
> using Java 8, build 162, on a MacBook Pro (early 2015).
> The bad news is that this is not our fault.
> I've profiled this behavior down to the 
> {{com.github.jsonldjava.utils.JsonUtils.fromURL(URL, CloseableHttpClient)}} 
> function. 94% of the parsing time is spent there. This function is called 
> when trying to load remote json-ld contexts. 
> In order to avoid loading remote contexts repeatedly, this function tries to 
> *cache* them by using a {{CachingHttpClient}} from the httpclient-osgi 
> library.
> Unfortunately, that strategy is *not* working, as I have recorded exactly 
> *zero* cache hits, meaning that *every* retrieval is a cache miss and a 
> remote context is re-fetched via http every single time it's accessed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
