Re: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Sebastian Schaffert

On 19.07.2012 at 20:50, Kingsley Idehen wrote:

 
 I completely understand and appreciate your desire (which I share) to see a 
 mature landscape with a range of linked data sources. I can also understand 
 how a database or spreadsheet can potentially offer fine-grained data access 
 - your examples do illustrate the point very well indeed! 
 
 However, if we want to build a sustainable business, the decision to build 
 these features needs to be demand driven. 
 
 I disagree.

 
 Note, I responded because I assumed this was a new Linked Data service. But 
 it clearly isn't. Thus, I don't want to open up a debate about Linked Data 
 virtues if you incorrectly assume they should be *demand driven*.
 
 Remember, this is the Linked Open Data (LOD) forum. We've long passed the issue 
 of *demand driven* over here, re. Linked Data. 

But I agree. A technology that is not able to fire proof its usefulness in a 
demand driven / problem driven environment is maybe interesting from an 
academic standpoint but otherwise not really useful. And if you look at the 
recent troubles with Semantic Web business models you see the consequences.

You are not the only one in the community, so please don't say we've passed 
the issue. I'd say we have not even really started with the issue, we've just 
pushed some technology out there, not knowing yet whether it is really useful. 
On the other hand Harish is giving us one example of where at least part of the 
technology *might* be useful and I appreciate this very much. In general, I 
also prefer acting over talking. ;-)

Considering comments like yours, I really fear the community will lose its 
openness and acceptance of differing opinions. I had already given up really 
following the discussions here for exactly that reason (and I am not the only 
one), but this message appeared on my phone before the mail client could sort 
it away and simply made me upset.

Greetings,

Sebastian
-- 
| Dr. Sebastian Schaffert  sebastian.schaff...@salzburgresearch.at
| Salzburg Research Forschungsgesellschaft  http://www.salzburgresearch.at
| Head of Knowledge and Media Technologies Group  +43 662 2288 423
| Jakob-Haringer Strasse 5/II
| A-5020 Salzburg





Re: position in cancer informatics

2012-07-20 Thread Stefan Decker
The discussion seems to point to a deeper question: how to enable crowd
sourcing of the analysis of these kinds of data sets? This may involve
running of analysis code or maybe even manual work.
What kind of computational infrastructure would we need to enable this? And
how do we validate and aggregate results?

On Thursday, 19 July 2012, Helena Deus wrote:

 And on a related topic and the reason why doing cancer informatics is so
 exciting in this area: a happy story where exploring data patterns enabled
 curing a cancer which had a 4-5% survival chance -
 http://www.nytimes.com/2012/07/08/health/in-gene-sequencing-treatment-for-leukemia-glimpses-of-the-future.html?_r=1



 On Jul 19, 2012, at 7:41 PM, Melvin Carvalho wrote:



 On 17 July 2012 22:27, Nathan <nat...@webr3.org> wrote:

 Can you open this right up for everybody to be involved?

 I know I for one would be happy to invest free time to looking at these
 datasets to find patterns - are they open and available online, any
 pointers to get started, anything at all that would enable me (and
 hopefully others skilled here) to work on this?

 It sounds like less of a position and more of a global need we who can
 should all be pumping time in to.


 Maybe related:

 15-Year-Old Maker Astronomically Improves Pancreatic Cancer Test


 http://blog.makezine.com/2012/07/18/15-year-old-maker-astronomically-improves-pancreatic-cancer-test/

 He gleaned information on the topic from his “good friend Google,” and
 began his research. Yes, he even got in trouble in his science class for
 reading articles on carbon nanotubes instead of doing his classwork. When
 Andraka had solidified ideas for his novel paper sensor, he wrote out his
 procedure, timeline, and budget, and emailed 200 professors at research
 institutes. He got 199 rejections and one acceptance from Johns Hopkins:
 “If you send out enough emails, someone’s going to say yes.”


 Best,

 Nathan


 Helena Deus wrote:

 Dear all,
 We have an exciting research assistant position open at DERI for a
 chance to work with Cancer Informatics! We are looking for an enthusiastic
 developer who is familiar with bioinformatics concepts. Your role will be
 exploring cancer-related datasets and looking for patterns (applying, for
 example, machine learning techniques) that can be used for personalized
 medicine.
 Please don't hesitate to forward this to whomever you think might be
 interested.
 To apply or to ask for more information, please reply to me
 (helena.d...@deri.org) with CV + motivation letter.
 Kind regards, Helena F. Deus, PhD
 Digital Enterprise Research Institute
 helena.d...@deri.org










-- 
Professor Stefan Decker
Director, Digital Enterprise Research Institute,
Professor of Digital Enterprise
National University of Ireland, Galway. Ireland.
Tel: +353.91.495011
E-mail: stefan.dec...@deri.org
Web: http://www.deri.ie
Personal: http://www.stefandecker.org


Re: position in cancer informatics

2012-07-20 Thread Jakub Kotowski
It seems to me that, more than a computational infrastructure, you would
need an efficient way to coordinate a community (communication, sharing of
resources such as files and software, ...) and a common way of describing
each methodology used and each set of results, in order to facilitate
subsequent result validation and aggregation. But maybe that's what you mean
by computational infrastructure? Or are you aiming for something more
automated?

Best,

Jakub


On 07/20/2012 11:22 AM, Stefan Decker wrote:
 The discussion seems to point to a deeper question: how to enable crowd
 sourcing of the analysis of these kinds of data sets? This may involve
 running of analysis code or maybe even manual work.
 What kind of computational infrastructure would we need to enable this?
 And how do we validate and aggregate results?
 




Re: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Michael Brunnbauer

Hello Sebastian,

I agree 100%

Regards,

Michael Brunnbauer

On Fri, Jul 20, 2012 at 10:06:38AM +0200, Sebastian Schaffert wrote:
 But I agree. A technology that is not able to fire proof its usefulness in a 
 demand driven / problem driven environment is maybe interesting from an 
 academic standpoint but otherwise not really useful. And if you look at the 
 recent troubles with Semantic Web business models you see the consequences.
 
 You are not the only one in the community, so please don't say we've 
 passed the issue. I'd say we have not even really started with the issue, 
 we've just pushed some technology out there, not knowing yet whether it is 
 really useful. On the other hand Harish is giving us one example of where at 
 least part of the technology *might* be useful and I appreciate this very 
 much. In general, I also prefer acting over talking. ;-)
 
 Considering comments like yours, I really fear the community will lose its 
 openness and acceptance of differing opinions. I had already given up really 
 following the discussions here for exactly that reason (and I am not the only 
 one), but this message appeared on my phone before the mail client could sort 
 it away and simply made me upset.

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail bru...@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel



RE: position in cancer informatics

2012-07-20 Thread Deus, Helena
WeConsent (http://weconsent.us/about.) is trying to address that through
encouraging people to freely share their own health/genomics data
instead of expecting health care professionals to do so. Supporting the
deposition of this data by the patients may be step #1 towards a
computational infrastructure.

 

From: Stefan Decker [mailto:stefan.dec...@deri.org] 
Sent: 20 July 2012 10:23
To: Deus, Helena
Cc: Melvin Carvalho; nat...@webr3.org; Hausenblas, Michael;
semantic-...@w3.org; public-lod@w3.org; www-rdf-inter...@w3.org;
protege-discuss...@lists.stanford.edu; semantic...@yahoogroups.com;
dbwo...@cs.wisc.edu; machine-learn...@egroups.com;
taverna-us...@lists.sourceforge.net; b...@bioinformatics.org
Subject: Re: position in cancer informatics

 

The discussion seems to point to a deeper question: how to enable crowd
sourcing of the analysis of these kinds of data sets? This may involve
running of analysis code or maybe even manual work.

What kind of computational infrastructure would we need to enable this?
And how do we validate and aggregate results?





Re: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Kingsley Idehen

On 7/19/12 9:13 PM, Mike Bergman wrote:

+1


On 7/19/2012 6:37 PM, glenn mcdonald wrote:

Remember, this is the Linked Open Data (LOD) forum. We've long passed
the issue of *demand driven* over here, re. Linked Data.


There's a difference between solving an issue and just refusing to
address it any more. Pity there isn't another forum for generating
actual demand as reliably as this one supplies scorn.







Mike,

What problem are we refusing to address, as exemplified by Semgel
(http://semgel.com/)?


If I am missing something here, I would like to know what it is. Of 
course, I might have completely overlooked something, so I am always 
open to correction.


--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen









Re: position in cancer informatics

2012-07-20 Thread David Booth
On Fri, 2012-07-20 at 10:22 +0100, Stefan Decker wrote:
 The discussion seems to point to a deeper question: how to enable crowd
 sourcing of the analysis of these kinds of data sets? This may involve
 running of analysis code or maybe even manual work.
 What kind of computational infrastructure would we need to enable
 this? And how do we validate and aggregate results?

Unfortunately, in the USA at least, the biggest barriers are not
technical, but social, because: (a) health information privacy laws such
as HIPAA
http://www.hhs.gov/ocr/privacy/ 
make it difficult or impossible to publish the raw data that would be
most useful for research; and (b) researchers do not have the incentive
to publish their data that might allow other researchers to make
discoveries.  

There is a tension between privacy and the usefulness of data for
research, because full de-identification removes information that can be
critical to determining cause and effect, such as dates, times and
locations.  

We need better ways -- both bottom-up, such as http://weconsent.us/, and
top-down, such as legal changes -- to both encourage the availability of
research data and to facilitate appropriate access to it, such as
establishing well-defined tiers of access for different purposes.

We need technical solutions that will help us work through and around
these social barriers.


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.




Re: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Mike Bergman

Hi Kingsley,

On 7/20/2012 5:30 AM, Kingsley Idehen wrote:

On 7/19/12 9:13 PM, Mike Bergman wrote:

+1


On 7/19/2012 6:37 PM, glenn mcdonald wrote:

Remember, this is the Linked Open Data (LOD) forum. We've long passed
the issue of *demand driven* over here, re. Linked Data.


There's a difference between solving an issue and just refusing to
address it any more. Pity there isn't another forum for generating
actual demand as reliably as this one supplies scorn.







Mike,

What problem are we refusing to address, as exemplified by Semgel
(http://semgel.com/)?

If I am missing something here, I would like to know what it is. Of
course, I might have completely overlooked something, so I am always
open to correction.


Sebastian Schaffert's response addressed this well. I also comment in 
various ways on this topic on my blog as frequently as the muse strikes me.


The reason I simply commented by +1 is that I did not want to 
contribute to what are often lengthy polemics as to what the community 
is or believes or purports. As best as I can tell there is no true 
community here, but a diverse set of players with diverse interests and 
perspectives. If you follow my writings closely you know that I see 
linked data as a useful and often desirable technique, but not a means.


In your enthusiasm and cheerleading you as often turn people off as 
inspire them. You too frequently take it upon yourself to speak for the 
community. Semgel is a nice contribution being contributed by a new, 
enthusiastic contributor. I think this is to be applauded, not lectured 
or scolded. Semgel is certainly as much on topic as most of the posts to 
this forum.


There is a reason I and Structured Dynamics no longer comment or 
participate on these forums. If I hear from others correctly, our 
perspective is shared by many.


I will not comment further on this thread.

Thanks, Mike







--
__

Michael K. Bergman
CEO  Structured Dynamics LLC
319.621.5225
skype:michaelkbergman
http://structureddynamics.com
http://mkbergman.com
http://www.linkedin.com/in/mkbergman
__



[Additional Extension] Semantic Web for the Legal Domain

2012-07-20 Thread Silvio Peroni
*** Many apologies for cross-posting ***

SEMANTIC WEB JOURNAL
SPECIAL ISSUE ON

Semantic Web for the legal domain

http://www.semantic-web-journal.net/blog/semantic-web-journal-special-issue-semantic-web-legal-domain


*** New Extended Submission Deadline: 30-Aug-2012 ***


CALL FOR PAPERS

The legal domain is an ideal field of study for Semantic Web researchers, as it 
uses and contributes to most of the topics that are relevant to the community. 
Additionally, given the complex interactions of legal actors, legal sources and 
legal processes, as well as the relevance and potential impact of successful 
ideas in the political, juridical and social processes of a country, it 
provides a challenging context and an important opportunity for groundbreaking 
research results.

Ontologies, knowledge extraction and reasoning techniques have been studied by 
the AILAW community for years but its results have generated few and sparse 
connections with the Semantic Web community. Thus, the aim of this Special 
Issue is to look to the legal domain from a Semantic Web perspective, in order 
to promote the use of legal knowledge for addressing Semantic Web research 
questions and, vice versa, to use Semantic Web technologies as tools for 
reasoning over legal texts.

In particular, we are looking for high-level contributions exploring and 
investigating (but not limited to) the following topics:
- Modelling access policies to Semantic Web datasets
- Semantic Web and online dispute resolution and mediation
- Law and Regulations patterns of Social Web communities (such as Second Life, 
Facebook, or Twitter)
- Semantic sensor networks in lawsuits, crisis mapping, emergencies and 
stand-by forces
- Semantic Web techniques and e-discovery in large legal document collections
- Semantic Web technologies and opinion collection and analysis
- Legal content and knowledge in the Linked Data 
- Knowledge acquisition and concept representation on annotations and legal 
texts 
- Legal reasoning and query in the Semantic Web
- Text and legal interpretation in legal semantics
- Scalability issues in representing law and legal texts 
- Analysis of provenance information to detect violations of norms/policies 
- Legal knowledge in trust models
- Expressive vs. lightweight representations of legal content
- Core and domain ontologies in the legal domain
- Theories, design patterns and ontologies in legal argumentation
- Time and legal content representation (texts, concepts, norms)
- OWL approaches to reasoning and  legal knowledge
- Linking legal content to external resources
- Provenance, trust and metadata for authoritative sources
- SPARQL queries on large legal  datasets
- Legal knowledge extraction using NLP and ontologies
- User-friendly applications and interface design to interact with legal 
semantic information
- Publishing/reusing legal-related content in Linked Data
- Legal semantic services and mobile applications
- Rules and Automated Reasoning in the Semantic Web



SUBMISSION INSTRUCTIONS

Prospective authors must take notice of the submission guidelines posted at 
http://www.semantic-web-journal.net/authors. Submissions should be uploaded 
using the regular submission mechanism of the SW journal (see the journal 
website) - mentioning in the cover letter that the submission is for the legal 
domain special issue.



DEADLINES

Submission deadline: 30-Aug-2012
Reviews due: 6-Oct-2012
Notifications: 14-Oct-2012
Second submission: 15-Nov-2012
Second review due: 15-Dec-2012
Second notifications: 22-Dec-2012
Camera ready: 30-Jan-2013




GUEST EDITORS

Pompeu Casanovas (Universitat Autònoma de Barcelona + Victoria University)
Monica Palmirani (University of Bologna)
Silvio Peroni (University of Bologna)
Fabio Vitali (University of Bologna)

E-mail: guest-editors-semantic-web-for-the-legal-dom...@googlegroups.com



EDITORIAL BOARD

Gioele Barabucci (University of Bologna)
Nick Bassiliades (Aristotle University of Thessaloniki) 
Eva Blomqvist (Jönköping University)
Guido Boella (University of Turin)
Alexander Boer (University of Amsterdam)
Tom Bruce (Cornell University)
Nuria Casellas (Universitat Autònoma de Barcelona)
Paolo Ciccarese (Harvard University)
Oscar Corcho (Universidad Politécnica de Madrid)
John Davies (British Telecomm)
Stefan Dietze (Leibniz University)
Dieter Fensel (University of Innsbruck)
Miriam Fernandez (Open University)
Meritxell Fernandez-Barrera (Université Paris 2)
Enrico Francesconi (Italian National Research Council)
Aldo Gangemi (Italian National Research Council)
Tom Gordon (Fraunhofer Society)
Guido Governatori (National ICT Australia)
Jorge Gracia (Universidad Politécnica de Madrid)
Marco Grobelnik (Jozef Stefan Institute)
Rinke Hoekstra (University of Amsterdam)
Simonetta Montemagni (Italian National Research Council)
Enrico Motta (Open University)
Pablo Noriega (Spanish National Research Council)
Adrian Paschke (Freie Universität Berlin)
Enric Plaza (Spanish National Research Council)

Re: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Kingsley Idehen

On 7/20/12 4:06 AM, Sebastian Schaffert wrote:

On 19.07.2012 at 20:50, Kingsley Idehen wrote:


I completely understand and appreciate your desire (which I share) to see a 
mature landscape with a range of linked data sources. I can also understand how 
a database or spreadsheet can potentially offer fine-grained data access - your 
examples do illustrate the point very well indeed!

However, if we want to build a sustainable business, the decision to build 
these features needs to be demand driven.

I disagree.
Note, I responded because I assumed this was a new Linked Data service. But it 
clearly isn't. Thus, I don't want to open up a debate about Linked Data virtues 
if you incorrectly assume they should be *demand driven*.

Remember, this is the Linked Open Data (LOD) forum. We've long passed the issue 
of *demand driven* over here, re. Linked Data.

But I agree. A technology that is not able to fire proof its usefulness in a 
demand driven / problem driven environment is maybe interesting from an 
academic standpoint but otherwise not really useful.


So are you claiming that Linked Data hasn't fire proofed its usefulness 
in a demand driven / problem driven environment?



And if you look at the recent troubles with Semantic Web business models you 
see the consequences.


Please clarify what you mean, as that statement is quite unclear. What 
recent troubles are you speaking about (so definitively) regarding the 
business model scalability and viability of Linked Data and/or the 
broader Semantic Web vision?




You are not the only one in the community, so please don't say we've passed the 
issue.


Of course I am not the only one in the community. But I think you are 
missing a critical point: this forum/list/community is about Linked 
Data. Thus, I would expect product announcements to be related to Linked 
Data, at the very least. What's really confusing to me, right now, is 
the fact that I simply sought an actual Linked Data connection from 
Harish (assuming there had to be one somewhere), received push-back 
about demand, and then a string of replies responding to something 
else inferred from my response.




I'd say we have not even really started with the issue, we've just pushed some 
technology out there, not knowing yet whether it is really useful.


I disagree, and here are some very basic examples of proof that the 
utility (usefulness) and demand (need) for Linked Data are yesterday's 
topic:


1. Facebook -- every data object in this data space has a Linked Data 
URI, and by that I mean all 850 million+ profiles alongside other data 
objects that represent other aspects of Facebook profiles


2. Various Govts. worldwide -- led by US and UK govt efforts enhancing 
Open Data by adhering to the principles espoused in TimBL's Linked Data meme


3. Rest of the LOD cloud, which now tops 55+ billion triples and is growing 
every second.




  On the other hand Harish is giving us one example of where at least part of 
the technology *might* be useful and I appreciate this very much. In general, I 
also prefer acting over talking. ;-)


Useful, of course. But useful in a manner that has relevance to Linked 
Data is what I sought from my questions. There is no Linked Data in that 
solution, and all I wanted to do was foster dialog that would encourage 
production of Linked Data as others have already done -- for years -- 
re. data from Crunchbase.


My response included examples of what's been achieved with Crunchbase 
data for a very long time, so I hoped he would see the virtues in doing 
something similar such that in classic Linked Data fashion you end up 
with a richer Web of Linked Data.





Considering comments like yours, I really fear the community will lose its 
openness and acceptance of differing opinions.


What is the differing opinion?


  I had already given up really following the discussions here for exactly that 
reason (and I am not the only one), but this message appeared on my phone 
before the mail client could sort it away and simply made me upset.


Sorry for upsetting you, and I hope you become less upset when you 
understand my point. A simple route to that destination starts by you 
responding to my questions.


I strongly believe you've misunderstood my response, as measured as it 
was, initially. Thus, let's reconcile all of this, and I am quite 
confident that my fundamental point will be resurrected and then clearly 
understood.




Greetings,

Sebastian



--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen









Re: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Kingsley Idehen

On 7/20/12 10:17 AM, Mike Bergman wrote:

Hi Kingsley,


Sebastian Schaffert's response addressed this well. I also comment in 
various ways on this topic on my blog as frequently as the muse 
strikes me.


The reason I simply commented by +1 is that I did not want to 
contribute to what are often lengthy polemics as to what the 
community is or believes or purports. As best as I can tell there is 
no true community here, but a diverse set of players with diverse 
interests and perspectives. If you follow my writings closely you know 
that I see linked data as a useful and often desirable technique, but 
not a means.


In your enthusiasm and cheerleading you as often turn people off as 
inspire them. You too frequently take it upon yourself to speak for 
the community. Semgel is a nice contribution being contributed by a 
new, enthusiastic contributor. I think this is to be applauded, not 
lectured or scolded. Semgel is certainly as much on topic as most of 
the posts to this forum.


There is a reason I and Structured Dynamics no longer comment or 
participate on these forums. If I hear from others correctly, our 
perspective is shared by many.


I will not comment further on this thread.

Thanks, Mike


Mike,

To cut a long story short, you are disagreeing with my attempt to seek 
Linked Data relevance from a solution announced on this mailing list? 
All I sought was the Linked Data dimension that I assumed this solution 
possessed. That's it.




--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen









Re: position in cancer informatics

2012-07-20 Thread Gannon Dick
Sorry Dr. Booth, but I can't accept this conclusion.

"We need technical solutions that will help us work through and around these 
social barriers."

The root problem with Linked Data is deeply embedded in, and I would argue 
inseparable from, the current methods of the Mobile Web.  The ID provided by a 
Mobile Device is not the ID best used for Linked Data to prosper.
A little thought experiment:  Say you had a telescope with a single 
magnification and aimed it at a forest nearby.  At some optimum magnification 
you could identify individual trees (Linked Data) and at some much higher 
magnification you would see tree bark (uncoupled microdata) everywhere you 
looked.  Linked Data only works when the observed body (tree) is at rest.  
Newton's First Law and the Heisenberg Uncertainty Principle vanish only to 
return with a vengeance when initial conditions are modified (the body moves OR 
magnification is increased step-wise).

For socio-economic data including Health Data, the implication is that a Mobile 
Device is suitable to distribute data but cannot be used to infer metadata in a 
current experiment.  Collecting data in real time is problematic too.  I 
believe that if Mobile Device Technology is the constant and Linked Data IDs 
are flexible we will never get to where we want to go.  Technical solutions 
for research data are not impeded by social barriers but rather by the tools 
(The Mobile Web) which exaggerate the commercial value of Personally 
Identifiable Information and over-magnify the data rendering, making said data 
useless for research.  Smart Home Appliances are not the answer either, but 
I've talked too much already.




 From: David Booth da...@dbooth.org
To: Stefan Decker stefan.dec...@deri.org 
Sent: Friday, July 20, 2012 9:00 AM
Subject: Re: position in cancer informatics
 
On Fri, 2012-07-20 at 10:22 +0100, Stefan Decker wrote:
 The discussion seems to point to a deeper question: how to enable crowd
 sourcing of the analysis of these kinds of data sets? This may involve
 running of analysis code or maybe even manual work.
 What kind of computational infrastructure would we need to enable
 this? And how do we validate and aggregate results?

Unfortunately, in the USA at least, the biggest barriers are not
technical, but social, because: (a) health information privacy laws such
as HIPAA
http://www.hhs.gov/ocr/privacy/ 
make it difficult or impossible to publish the raw data that would be
most useful for research; and (b) researchers do not have the incentive
to publish their data that might allow other researchers to make
discoveries.  

There is a tension between privacy and the usefulness of data for
research, because full de-identification removes information that can be
critical to determining cause and effect, such as dates, times and
locations.  

We need better ways -- both bottom-up, such as http://weconsent.us/, and
top-down, such as legal changes -- to both encourage the availability of
research data and to facilitate appropriate access to it, such as
establishing well-defined tiers of access for different purposes.

We need technical solutions that will help us work through and around
these social barriers.


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.

Re: position in cancer informatics

2012-07-20 Thread Paola Di Maio
 We need technical solutions that will help us work through and around
 these social barriers.


Suggested rephrase perhaps:

we need the *socio-technical systems* that will help us work through
and around ...

  etc etc



PDM
ISTCS.org
socio-technical systems research


 --
 David Booth, Ph.D.
 http://dbooth.org/

 Opinions expressed herein are those of the author and do not necessarily
 reflect those of his employer.





Re: position in cancer informatics

2012-07-20 Thread Peter Jones
On the sociotechnical side, of possible interest:

http://hodges-model.blogspot.co.uk/search/label/sociotechnical

 
Re. cancer informatics: I'm enrolled in four BOINC projects. None is cancer 
specific, but a couple are no doubt related, and the public can join in, at 
least indirectly, by crunching data.

This is an area where citizen science could also make a real difference.
 

Peter Jones, Lancashire, UK
Blogging at Welcome to the QUAD
http://hodges-model.blogspot.com/
h2cm: help 2C more - help 2 listen - help 2 care
http://twitter.com/#!/h2cm






Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Kingsley Idehen

On 7/20/12 11:48 AM, Sebastian Schaffert wrote:

[SNIP] -- so that we can focus on the key non personal points.

My claim is founded in the many discussions I have when going to the CTOs 
of *real* companies (big ones, outside the research business) out there and 
trying to convince them that they should build on Semantic Web technologies 
(because I believe they are superior). Believe me, even though I strongly 
believe in the technology, this is a very tough job without a good reference 
example that convinces them they will save X millions of Euros or improve the 
lives of their employees or society in the short to medium term.


Why do you assume that others (like myself) who don't share your views 
don't talk to CTOs? BTW, there are a number of companies that 
actually have paying customers using Linked Data effectively; these 
companies may not necessarily believe in announcing every customer 
closure related to Linked Data.




Random sample answer from this week (I could bring many): "So this Linked Data is a 
possibility for data integration. Tell me, why should I convince my engineers to throw 
away their proven integration solutions? Why is Linked Data so superior to existing 
solutions? Where is it already in enterprise use?"


I don't know how you've concluded that Linked Data is a "rip and 
replace" approach to technology adoption. It's quite the contrary.
Linked Data's most powerful virtue is its ability to enhance what 
already exists re:


1. data object identity
2. data object representation
3. data object access
4. data object serialization
5. data object access control lists and policies.

Please read some of the older threads on this mailing list. Do you think 
Facebook publishes Linked Data for no good reason? Ditto the U.S. and UK 
governments, amongst many other contributors to the LOD cloud? Likewise 
any other enterprise that's already effectively using Linked Data as a 
conceptual-model-oriented virtualization atop disparate data sources, etc.?




The big datasets always sold as a success story in the Linked Data Cloud are 
largely irrelevant to businesses:
- they are mostly dealing with internal data (projects, people, CRM, ERP, 
documents, CMS, …) where you won't find information in the LD cloud anyways


There is a difference between the Linked Open Data (LOD) Cloud and 
Linked Data. There's also a subtle difference between Linked Open Data 
and the LOD Cloud.


Linked Open Data is about standards-based structured data representation 
and access, based on a specific use of dereferenceable URIs to augment 
said data representation and access.


LOD Cloud is about publicly accessible application of the above, with 
contributions from a plethora of sources, across a variety of subject 
matter domains.



- they do not trust just some data from the Internet to build multi-million 
business decisions on top


See my comment above. That isn't what I am talking about.

- they find the data in the cloud too messy (as an example: try finding country 
codes on DBPedia …) and too unreliable (most servers do not respond in 
sufficient time)


Ditto, not my point. The LOD cloud is a distributed lookup table and 
that's about it.


Mike has actually assembled some very nice blog posts on related topics:
- http://www.mkbergman.com/917/practical-p-p-p-problems-with-linked-data/
- http://www.mkbergman.com/859/seven-pillars-of-the-open-semantic-enterprise/


I am no stranger to Mike. Sometimes it helps if you do a few lookups to 
provide context for your responses.







And if you look at the recent troubles with Semantic Web business models you 
see the consequences.


Please clarify what you mean as that statement is quite unclear. What recent 
troubles are you speaking (so definitively) about, re. the business model scalability 
and viability of Linked Data and/or the broader Semantic Web vision?

I was referring to the recent bankruptcy of Ontoprise and the fact that Talis is reducing 
its Linked Data involvement, essentially shutting down their "we help you publish 
Linked Data" service. I thought you might have guessed.


Why should I guess? You oversimplify those items, and I am not in the 
business of speaking about other companies. Talking about markets, 
technologies, and business models is fine for me, but it stops right 
there.








You are not the only one in the community, so please don't say we've passed the 
issue.


Of course I am not the only one in the community. But I think you are missing a 
critical point: this forum/list/community is about Linked Data. Thus, I would expect product 
announcements to be related to Linked Data, at the very least. What's really confusing to 
me, right now, is the fact that I simply sought an actual Linked Data connection from Harish 
(assuming there had to be one somewhere), received push-back about demand and a 
string of replies that are responding to something else inferred from my response.

The problem is that in most of your 

System for Crowdsourcing Data Analysis [Was: position in cancer informatics]

2012-07-20 Thread Adrian Walker
Hi All,

Stefan Decker wrote:
 The discussion seems to point to a deeper question: how to enable crowd
 sourcing of the analysis of these kinds of data sets? This may involve
 running of analysis code or maybe even manual work.
 What kind of computational infrastructure would we need to enable
 this? And how do we validate and aggregate results?

There is a system online [1] for crowdsourcing data analysis knowledge in
Executable English, with examples, such as [2]. The knowledge is used to
answer questions over web databases, with English explanations of the
results for validation.   In some cases, the explanations can be used as
plans.

[3] is a short overview paper, and besides the live system [1], there are
several presentations, movies etc on the site.

Apologies if you have seen this before, and thanks for comments.

-- Adrian

[1]  Internet Business Logic
A Wiki and SOA Endpoint for Executable Open Vocabulary English Q/A over SQL
and RDF
Online at www.reengineeringllc.com
Shared use is free, and there are no advertisements

[2]  www.reengineeringllc.com/demo_agents/MedMine2.agent

[3]
www.reengineeringllc.com/A_Wiki_for_Business_Rules_in_Open_Vocabulary_Executable_English.pdf






Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Sebastian Schaffert
Kingsley,

On 20.07.2012 at 20:20, Kingsley Idehen wrote:

 Again, how have you arrived at the Linked Data vs CSV scenario? Secondly, if 
 you'd done some background lookup, you would have stumbled across comments 
 I've made about CSV and Linked Data. 

This is exactly the kind of comment by which you prove my point (regarding the 
discussion culture). I refrain from any further discussion on the topic until 
you stop assuming everyone else is stupid when he does not agree on your 
points, like I already announced in private discussion.

Have a nice weekend.

Sebastian
-- 
| Dr. Sebastian Schaffert  sebastian.schaff...@salzburgresearch.at
| Salzburg Research Forschungsgesellschaft  http://www.salzburgresearch.at
| Head of Knowledge and Media Technologies Group  +43 662 2288 423
| Jakob-Haringer Strasse 5/II
| A-5020 Salzburg





Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Kingsley Idehen

On 7/20/12 4:05 PM, Sebastian Schaffert wrote:

Kingsley,

On 20.07.2012 at 20:20, Kingsley Idehen wrote:


Again, how have you arrived at the Linked Data vs CSV scenario? Secondly, if 
you'd done some background lookup, you would have stumbled across comments I've 
made about CSV and Linked Data.

This is exactly the kind of comment by which you prove my point (regarding the 
discussion culture). I refrain from any further discussion on the topic until 
you stop assuming everyone else is stupid when he does not agree on your 
points, like I already announced in private discussion.

Have a nice weekend.

Sebastian

You are not going to get away with misrepresenting me in public. It 
won't happen.


Here is what you posted en route to my response:

Where is the convincing business application? Since most of the data is 
statistics anyways, where is Linked Data superior to say CSV?


To clarify my response:

What in my thread or past commentary would lead you to ask me, or 
anyone else for that matter, such a question?


Links (via simple Google search) :

1. http://ontolog.cim3.net/forum/ontolog-forum/2010-10/msg00263.html -- 
ontolog forum post that leads to discussion about CSV and Linked Data


2. http://bit.ly/QhGBXY -- explaining how CSV output from SPARQL 
endpoints delivers powerful hooks into Google Spreadsheet


3. http://bit.ly/NP8uWv -- ditto for Microsoft Excel.
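
As a rough illustration of the CSV-from-SPARQL idea in links 2 and 3, here is a minimal sketch, assuming an endpoint that accepts the standard SPARQL protocol `query` parameter and can return the SPARQL 1.1 CSV results format (`text/csv`). The endpoint URL, the query and both helper names are hypothetical and are not taken from the linked posts; the request is only built here, not sent.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and query, chosen only for illustration.
ENDPOINT = "http://example.org/sparql"
QUERY = (
    "SELECT ?s ?label WHERE "
    "{ ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 10"
)


def build_csv_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL GET request that asks for CSV results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "text/csv"})


def parse_csv_results(text: str) -> list:
    """Parse a CSV result body; the first row holds the variable names."""
    return list(csv.DictReader(io.StringIO(text)))
```

A spreadsheet import performs the same two steps: fetch a URL that returns CSV, then treat the first row as column headers.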


--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen









Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Dave Reynolds

Hi Sebastian,

I completely agree with what you say about:
  o Harish's original post being relevant to linked data and this list
  o that the culture of this forum can be counterproductive
  o that the evidence for linked data delivering business value needs
to be a lot stronger

However, just to balance the picture slightly ...

There are *some* clear, well documented examples of semweb/RDF/LD 
delivering business value through data integration. The most famous of 
these are probably Garlik (now Experian), Amdocs and arguably the 
BBC. In my experience for every publicised example there are several 
non-public or at least less visible examples of companies quietly using 
the technology internally while not shouting about it. I've come across 
examples in banking, publishing, travel and health care - at different 
levels of maturity.


Not saying the business value story is perfectly articulated or the 
evidence is watertight, but it's not totally absent :)


While it's not your main point, I would also say we have reasonable 
arguments for the value of linked data over just CSVs for publishing 
government statistics and measurement data. The benefits include safer 
use of data because it's self-describing (e.g. units!), the ability to slice 
and dice through API calls, making it easier to build apps, and the ability to 
address the data and thus annotate it and reference it. The more 
advanced government departments approach this as "publish once, use 
many": one pipeline that lets people access the data as dumps, through 
REST APIs, as Linked Data or via apps, all powered by a shared Linked 
Data infrastructure. It's not CSV or Linked Data; it's CSV *and* Linked 
Data.
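
To make the "self-describing (e.g. units!)" point concrete, here is a minimal sketch, assuming a made-up measurement CSV whose header does not state the unit. The namespace, property names, unit URI and the `row_to_triples` helper are all hypothetical, invented for this illustration; a real publisher would use an established vocabulary.

```python
import csv
import io

# Hypothetical CSV extract: nothing in the header says whether the value is mg/l or ug/l.
CSV_TEXT = "station,date,nitrate\nS1,2012-07-20,4.2\n"

# Hypothetical namespace and unit URI, invented for this illustration.
BASE = "http://example.org/"
UNIT = BASE + "unit/milligramPerLitre"


def row_to_triples(row: dict) -> list:
    """Turn one CSV row into (subject, predicate, object) triples with an explicit unit."""
    obs = f"{BASE}observation/{row['station']}/{row['date']}"
    return [
        (obs, BASE + "def/station", BASE + "station/" + row["station"]),
        (obs, BASE + "def/date", row["date"]),
        (obs, BASE + "def/nitrate", float(row["nitrate"])),
        (obs, BASE + "def/unit", UNIT),
    ]


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
triples = row_to_triples(rows[0])
for s, p, o in triples:
    print(s, p, o)
```

The observation now has its own address, so it can be referenced or annotated, and the unit travels with the value instead of living in a separate README.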


Dave

On 20/07/12 16:48, Sebastian Schaffert wrote:

Kingsley,

I am trying to respond to your factual arguments inline. But let me first point out that the 
central problem for me is exactly what Mike pointed out: "In your enthusiasm and cheerleading 
you as often turn people off as inspire them. You too frequently take it upon yourself to 
speak for the community. Semgel is a nice contribution being contributed by a new, 
enthusiastic contributor. I think this is to be applauded, not lectured or scolded. Semgel is 
certainly as much on topic as most of the posts to this forum."

The message you should hear is that many people are frustrated by the way the 
discussions in this forum are carried out and have already stopped contributing 
or even reading. And this is a very bad development for a community. The topic 
we are discussing right now is only a symptom. Please think about it.

On 20.07.2012 at 16:43, Kingsley Idehen wrote:


On 7/20/12 4:06 AM, Sebastian Schaffert wrote:

On 19.07.2012 at 20:50, Kingsley Idehen wrote:


I completely understand and appreciate your desire (which I share) to see a 
mature landscape with a range of linked data sources. I can also understand how 
a database or spreadsheet can potentially offer fine-grained data access - your 
examples do illustrate the point very well indeed!

However, if we want to build a sustainable business, the decision to build 
these features needs to be demand driven.

I disagree.
Note, I responded because I assumed this was a new Linked Data service. But it 
clearly isn't. Thus, I don't want to open up a debate about Linked Data virtues 
if you incorrectly assume they should be *demand driven*.

Remember, this is the Linked Open Data (LOD) forum. We've long past the issue 
of *demand driven* over here, re. Linked Data.

But I agree. A technology that is not able to fire proof its usefulness in a 
demand driven / problem driven environment is maybe interesting from an 
academic standpoint but otherwise not really useful.


So are you claiming that Linked Data hasn't fire proofed its usefulness in a 
demand driven / problem driven environment?



Indeed. This is my right as much as yours is to claim the opposite.

My claim is founded in the many discussions I have when going to the CTOs of 
*real* companies (big ones, outside the research business) out there and trying 
to convince them that they should build on Semantic Web technologies (because I 
believe they are superior). Believe me, even though I strongly believe in the 
technology, this is a very tough job without a good reference example that 
convinces them they will save X millions of Euros or improve the lives of their 
employees or society in the short to medium term.

Random sample answer from this week (I could bring many): "So this Linked Data is a 
possibility for data integration. Tell me, why should I convince my engineers to throw 
away their proven integration solutions? Why is Linked Data so superior to existing 
solutions? Where is it already in enterprise use?"

The big datasets always sold as a success story in the Linked Data Cloud are 
largely irrelevant to businesses:
- they are mostly dealing with internal data (projects, people, CRM, ERP, 
documents, CMS, …) where you won't find information in the 

Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Martynas Jusevičius
Sebastian, all,

I'm on your side here. But regarding Linked Data, consider the
following points that slow down its adoption:
- data-heavy players such as Facebook and Google might not be
interested in adopting a new open, even if superior, data approach,
since it is in their interest to keep as much control over the data as
possible
- in the corporate world, big vendors like Microsoft and Oracle have
created a lock-in, and big companies and organizations are hesitating
to invest in new long-term solutions
- the long term is where Linked Data really shines, because while the
global data interconnectedness increases, it provides linear
integration costs instead of exponential as in the Web 2.0 API-to-API
approach
- RDF and Linked Data are quietly doing their job at research
institutes and innovative organizations like BBC and are not receiving
the marketing dollars thrown at NoSQL solutions such as MongoDB.
However when it comes to production use, NoSQL is no less problematic
than triplestores (I have some experience in the startup world), while
RDF is the only standardized NoSQL/graph data model, which even has a
query language and quite a few tools.
- RDF and Linked Data are taught at very few schools. Even in computer
science studies, web application development is often stuck at
PHP+MySQL level, or Web 2.0 and RESTful APIs at best.
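
One rough way to read the integration-cost point in the list above is to count mappings. The sketch below assumes that every pair of directly integrated sources needs one bespoke mapping, while a shared data model needs one mapping per source; both function names are hypothetical. Under that assumption the API-to-API count grows quadratically rather than strictly exponentially, but the contrast with the linear case is the same.

```python
def pairwise_mappings(n: int) -> int:
    """Mappings needed when every source is integrated directly with every other one."""
    return n * (n - 1) // 2


def shared_model_mappings(n: int) -> int:
    """Mappings needed when each source is mapped once to a shared data model."""
    return n


for n in (2, 5, 10, 50):
    print(n, pairwise_mappings(n), shared_model_mappings(n))
```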

So I would say Linked Data is like electric vehicles -- most who
understand the technology would find it superior, but there are a lot
of different agendas and interests that do not necessarily result in what
is better for the public. And then there is ignorance as well.

When it comes to Linked Data applications, I'm about to release
open-source code which I hope will make it easier.

Martynas
graphity.org


Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

2012-07-20 Thread Sebastian Schaffert
Dear Martynas,

Thanks for your constructive answer. I completely agree with all your points, 
and I am looking forward to your software (already checked the README ;-) ). We 
will surely try it out (maybe as a client for our Linked Media Framework).

The problem I am facing is that part of my (and my group's) current job is to 
try bringing the technologies we are developing in research into ordinary 
industry. Not the Microsofts, Facebooks or Oracles (who are all highly 
innovative in Web and database technologies), but small and big companies from 
the (traditional) media sector and manufacturing industry who have big IT 
departments and infrastructures and could benefit greatly from Linked Data and 
related technologies. They often still live in the world of CORBA, ERP and file 
systems, and not necessarily in the Web.

With the partners we have we silently follow the Linked Data approach by 
trying to solve their immediate problems and using Linked Data in the 
background. While in the media sector this is quite successful (see e.g. 
http://search.salzburg.com, 1.1 million news articles all as Linked Data but 
the interface is faceted search), it is significantly more difficult 
explaining the advantages to e.g. manufacturing industries. Some typical 
problems I already mentioned in my previous post (lack of trust, lack of 
relevant data, lack of quality). Some others - indirectly related to Linked 
Data:
- they have proven and working infrastructures, and they have experienced IT 
engineers knowing their stuff; why should they adopt a new technology? They 
don't have a Linked Data problem per se
- IT in such companies is typically a central department and not a business 
division; they have only limited resources for technology innovation, why 
invest in Linked Data and not in some other technology where they can say it 
will save us X million Euros? 

Maybe we are targeting the wrong or too difficult sector, true. But I am 
convinced that the technology is useful especially in such settings, so I want 
to prove it by building applications that would not be possible otherwise. 
Unfortunately, I am lacking convincing business cases that show THEM that the 
technology is superior. No one needs to convince ME about the virtues of Linked 
Data, or otherwise I would not develop software or publish scientific articles 
related to it. ;-)

If we could collect even a small set of convincing business cases and describe 
what problems they are solving and how, and also what problems they 
encountered, I think it would help many of us.

Greetings,

Sebastian

On 21.07.2012 at 00:16, Martynas Jusevičius wrote:

 Sebastian, all,
 
 I'm on your side here. But regarding Linked Data, consider the
 following points that slow down its adoption:
 - data-heavy players such as Facebook and Google might not be
 interested in adopting a new open, even if superior, data approach,
 since it is in their interest to keep as much control over the data as
 possible
 - in the corporate world, big vendors like Microsoft and Oracle have
 created a lock-in, and big companies and organizations are hesitating
 to invest in new long-term solutions
 - the long term is where Linked Data really shines, because while the
 global data interconnectedness increases, it provides linear
 integration costs instead of exponential as in the Web 2.0 API-to-API
 approach
 - RDF and Linked Data are quietly doing their job at research
 institutes and innovative organizations like BBC and are not receiving
 the marketing dollars thrown at NoSQL solutions such as MongoDB.
 However when it comes to production use, NoSQL is no less problematic
 than triplestores (I have some experience in the startup world), while
 RDF is the only standardized NoSQL/graph data model, which even has a
 query language and quite a few tools.
 - RDF and Linked Data are taught at very few schools. Even in computer
 science studies, web application development is often stuck at
 PHP+MySQL level, or Web 2.0 and RESTful APIs at best.
 
 So I would say Linked Data is like electrical vehicles -- most who
 understand the technology would find it superior, but there are a lot
 of different agendas and interests that not necessarily result in what
 is better for the public. And then there is ignorance as well.
 
 When it comes to Linked Data applications, I'm about to release to
 open-source code which I hope will make it easier.
 
 Martynas
 graphity.org
 
 On Fri, Jul 20, 2012 at 5:48 PM, Sebastian Schaffert
 sebastian.schaff...@salzburgresearch.at wrote:
 Kingsley,
 
 I am trying to respond to your factual arguments inline. But let me first 
 point out that the central problem for me is exactly what Mike pointed out: 
 In your enthusiasm and cheerleading you as often turn people off as inspire 
 them. You too frequently take it upon yourself to speak for the community. 
 Semgel is a nice contribution being contributed by a new, enthusiastic 
 contributor. I think this is to be