Yes Ted, thanks...
In Italy we don't have Thanksgiving, so I forgot that you were on
holiday in the USA. Sorry for disturbing you!

I kept working here, and meanwhile I've started using the CLUTO output for
a first estimate of the cluster labels.
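
A first label estimate of this kind could be sketched roughly as below. This is only an illustration and not the code used in the project: it assumes a CLUTO-style solution (one cluster id per line, negative for unclustered objects) and an rlabel list with one textual label per line, in the same order. The names estimate_labels and read_lines are hypothetical, and "most frequent token per cluster" is just one simple heuristic.

```python
from collections import Counter, defaultdict


def estimate_labels(assignments, rlabels, top_n=3):
    """Return {cluster_id: [most frequent tokens]} for a CLUTO-style solution."""
    if len(assignments) != len(rlabels):
        raise ValueError("solution and rlabel must have the same number of lines")
    tokens_by_cluster = defaultdict(Counter)
    for cluster_id, label in zip(assignments, rlabels):
        if cluster_id < 0:
            # Assumption: a negative id marks an object left unclustered.
            continue
        tokens_by_cluster[cluster_id].update(label.lower().split())
    return {
        cluster_id: [token for token, _ in counts.most_common(top_n)]
        for cluster_id, counts in tokens_by_cluster.items()
    }


def read_lines(path):
    """Read a file into a list of stripped lines (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle]


if __name__ == "__main__":
    # Tiny in-memory example; real input would come from the two files,
    # e.g. [int(x) for x in read_lines(solution_path)] and read_lines(rlabel_path).
    assignments = [0, 0, 1, -1]
    rlabels = ["diabete mellito glicemia", "diabete insulina", "ipertensione", "altro"]
    print(estimate_labels(assignments, rlabels, top_n=1))
```

Such frequency-based labels are only a baseline to compare against what a dedicated labeling tool produces.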

Anyway, I'm waiting for your response so that I can try to complete our
system with SenseClusters.
Thank you so much.

Stefano Silvestri,
Università di Napoli "Federico II"


2014-12-01 14:29 GMT+01:00 Ted Pedersen <[email protected]>:

> My apologies for the very delayed response - you caught us right
> before the start of the Thanksgiving Holidays here in the USA - we are
> back to work now, and so I'll take a look at this today.
>
> On Wed, Nov 26, 2014 at 3:19 PM, Stefano Silvestri
> <[email protected]> wrote:
> > Hi Ted,
> > as described in my previous email, I've launched my experiment. As I said,
> > the final step of my pipeline is cluster labeling with SenseClusters.
> > As a reminder, the system performs unsupervised relation extraction on
> > the entities found in 988 clinical records (the entities have been
> > extracted using the UMLS databases, and we cluster pairs of entities).
> >
> > To integrate the SenseClusters cluster labeling into our system, I've
> > produced a CLUTO-style output file for the clustering results (around
> > 160000 elements) and an rlabel file (same number of lines) listing all
> > the clustered elements. At this point, I have problems running
> > format_clusters.
> >
> > To perform the labeling, I need the output of format_clusters generated
> > with the --context option. So I've created a Senseval-2 file with
> > text2sval.pl. The input to text2sval.pl is a plain-text file with one
> > whole clinical record per line.
> > Naturally, each context contains more than one cluster member.
> > I haven't used any optional arguments with text2sval.pl.
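
The "one whole record per line" input described above could be prepared along these lines. This is a minimal sketch under stated assumptions, not part of SenseClusters: records_to_lines and write_contexts are hypothetical helper names, and the only thing taken from the thread is that each clinical record must end up on a single line of the plain-text file.

```python
import re


def records_to_lines(records):
    """Collapse each record to one line; skip records with no text at all."""
    lines = []
    for record in records:
        # Replace any run of whitespace (including newlines) with one space.
        flat = re.sub(r"\s+", " ", record).strip()
        if flat:
            lines.append(flat)
    return lines


def write_contexts(records, path):
    """Write one record per line and return how many lines were written."""
    lines = records_to_lines(records)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(lines)
```

Note that skipping empty records shifts the line numbers of the records that follow, so any mapping from line number back to record needs to account for that.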
> >
> > This output has 988 instance ids. Now, when I try to launch
> > format_clusters, I get the following error while it parses the
> > Senseval-2 file (it occurs when it reaches the last line of the file):
> > Use of uninitialized value $sentence in pattern match (m//) at
> > ../.cpan/build/Text-SenseClusters-1.03-FMoSjn/Toolkit/evaluate/format_clusters.pl
> > line 309, <SCON> line 5938.
> >
> > I suspect that the contexts I used are wrong, so my questions are:
> > 1) Should the contexts contain only the extracted entities, or the
> > relations?
> > 2) Must the number of contexts equal the number of clustered elements?
> > 3) If nothing is (theoretically) wrong, what could the error in the
> > Senseval-2 file be?
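
Before looking inside format_clusters.pl, the context file itself can be sanity-checked. The sketch below does not establish the cause of the error above, since the script's internals are not examined here. It only assumes that the file uses <instance id="..."> and <context> tags, and it reports unclosed instances, instances with an empty or missing context, and line counts that can be compared with the rlabel file. The function names are hypothetical.

```python
import re

INSTANCE_RE = re.compile(
    r'<instance\s+id="([^"]*)"[^>]*>(.*?)</instance>', re.DOTALL
)
CONTEXT_RE = re.compile(r"<context>(.*?)</context>", re.DOTALL)


def check_senseval2(text):
    """Summarise the instances found in a Senseval-2 style string."""
    opened = len(re.findall(r"<instance\b", text))
    ids, empty = [], []
    for match in INSTANCE_RE.finditer(text):
        instance_id, body = match.group(1), match.group(2)
        ids.append(instance_id)
        context = CONTEXT_RE.search(body)
        # Flag instances whose context is missing or contains only whitespace.
        if context is None or not context.group(1).strip():
            empty.append(instance_id)
    return {"opened": opened, "closed": len(ids), "ids": ids, "empty": empty}


def count_nonempty_lines(text):
    """Count non-blank lines, e.g. the entries of an rlabel file."""
    return sum(1 for line in text.splitlines() if line.strip())
```

If "opened" and "closed" differ, or "empty" is not empty, the file is worth inspecting near those instances. Comparing len(ids) with count_nonempty_lines on the rlabel text makes the 988 versus roughly 160000 difference explicit, which is the point of question 2.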
> >
> > I'm waiting for your response...
> > Thank you for your attention; I hope you can help us complete our
> > research.
> >
> >
> > 2014-10-23 16:02 GMT+02:00 Stefano Silvestri <[email protected]>:
> >>
> >> Hi Ted and thanks.
> >>
> >> The PoS tagging, entity recognition, feature extraction and clustering
> >> tasks are performed by our own system (not SenseClusters), which is
> >> still in development.
> >> Now I'm trying to use the clusterlabeling module of SenseClusters to
> >> show that we have found, with an unsupervised approach, the relations
> >> between medical entities in the clinical records (e.g. diabetes
> >> mellitus <> glycemia) and, in this way, to obtain some labels for the
> >> clusters.
> >>
> >> I'm now writing the code to create the context files, and then I'll run
> >> the experiments on cluster labeling. I'll let you know in a few days if
> >> everything worked well and, in case of a new publication, I'll cite your
> >> great work.
> >>
> >> I'm sure that I will have more questions in the next few days, so I
> >> thank you in advance.
> >> Stefano Silvestri
> >>
> >>
> >> 2014-10-23 15:07 GMT+02:00 Ted Pedersen <[email protected]>:
> >>>
> >>> Hi Stefano,
> >>>
> >>> This sounds like an interesting project, and it's good to know
> >>> SenseClusters is proving to be useful. See my responses inline...
> >>>
> >>> On Wed, Oct 22, 2014 at 5:58 AM, Stefano Silvestri
> >>> <[email protected]> wrote:
> >>> > I've used a clustering technique to discover, in an unsupervised way,
> >>> > relations between medical entities contained in a large collection of
> >>> > anonymized medical records, in a research project of the University
> >>> > of Naples.
> >>> > The data set is composed of a large set of features; all the results
> >>> > will be published shortly in a journal.
> >>> >
> >>> > The next step in the development of our system is unsupervised
> >>> > cluster (relation) labeling. For that, I'd like to try the
> >>> > clusterlabeling module from SenseClusters. To create the input for
> >>> > clusterlabeling I have to use the format_clusters module with the
> >>> > --context option, and here I have some problems.
> >>> >
> >>> > I have already produced a CLUTO-style cluster solution file from my
> >>> > system (no problem there).
> >>> >
> >>> > The rlabel file, if I'm right, is a file containing the explicit
> >>> > name of each clustered entity (in my case, the relation).
> >>> > Is that right?
> >>>
> >>> Yes, rlabel shows the cluster to which each instance has been assigned.
> >>>
> >>> >
> >>> > And now the problems with the context file...
> >>> > It should be in Senseval-2 format. My experimental data set consists
> >>> > of plain-text files, so I should use the utility that converts plain
> >>> > text to headless Senseval-2.
> >>> >
> >>> > I have some questions.
> >>> >
> >>> > 1) Does the context file have to combine all my input files (the
> >>> > medical records) into one large file, with each context corresponding
> >>> > to one medical record?
> >>>
> >>> Yes, the input for each run of SenseClusters should be a single file
> >>> with all your contexts included.
> >>>
> >>> >
> >>> > 2) Should the contexts be headless, or do I have to tag
> >>> > (<head></head>) all the entities (medical names) in the input?
> >>>
> >>> Your contexts can be headless, and so there is no need to include
> >>> <head> tags in your contexts.
> >>>
> >>> >
> >>> > 3) Are there other constraints on the context files (formatting,
> >>> > tags, or other)?
> >>> >
> >>>
> >>> There shouldn't be. The output from text2sval.pl should be acceptable
> >>> for input "as is".
> >>>
> >>> > If the experiments succeed, I will of course credit and cite the
> >>> > SenseClusters project.
> >>> >
> >>> > PS - my system works on the Italian language.
> >>>
> >>> That's great! We'd be happy to answer further questions as they arise,
> >>> and will be curious to know how things work out!
> >>>
> >>> Good luck,
> >>> Ted
> >>>
> >>> >
> >>> > Thanks for your response,
> >>> > Stefano Silvestri,
> >>> > NLP researcher at the University of Naples "Federico II"
> >>> >
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > senseclusters-users mailing list
> >>> > [email protected]
> >>> > https://lists.sourceforge.net/lists/listinfo/senseclusters-users
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Ted Pedersen
> >>> http://www.d.umn.edu/~tpederse
