Hi Stefano,

This sounds like an interesting project, and it's good to know
SenseClusters is proving to be useful. See my responses inline...

On Wed, Oct 22, 2014 at 5:58 AM, Stefano Silvestri
<[email protected]> wrote:
> I've used a clustering technique to discover, in an unsupervised way,
> relations between medical entities contained in a large collection of
> anonymized medical records, in a research project at the University of Naples.
> The data set is composed of a large set of features - all the results will
> be published shortly in a journal.
>
> The next step in the development of our system is performing unsupervised
> cluster (relation) labeling. To do that, I plan to try the clusterlabeling
> module from SenseClusters. To create the input for clusterlabeling I have
> to use the format_clusters module with the --context option, and here I have
> some problems.
>
> I have already produced a cluto-style cluster solution file (no problem for
> that) from my system.
>
> The rlabel file, if I'm right, is a file containing the explicit
> corresponding name of each entity in the cluster (in my case the relation).
> Is that right?

Yes, rlabel shows the cluster to which each instance has been assigned.

>
> And now the problems with the context file...
> It should be in Senseval-2 format. My experimental data set consists of
> plain text files - so I should use the plain text to headless Senseval-2
> utility.
>
> I have some questions.
>
> 1) Does the context file have to combine all my input files (the
> medical records) into one large file (with each context corresponding to a
> medical record)?

Yes, the input for each run of SenseClusters should be a single file
with all your contexts included.
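
As a rough illustration of building that single file, here is a minimal
Python sketch. It assumes each medical record is a separate UTF-8 text
file and that the plain text input should have one context per line. The
function name, the directory layout, and the "*.txt" pattern are
hypothetical and not part of SenseClusters.

```python
from pathlib import Path


def merge_records(record_dir, out_path, pattern="*.txt"):
    """Write every record found in record_dir to out_path, one per line.

    Returns the number of non-empty records written.
    """
    # Collect the input paths first so the output file is never picked up
    # as an input, even if it lives in the same directory.
    paths = sorted(Path(record_dir).glob(pattern))
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            text = path.read_text(encoding="utf-8")
            # Collapse all whitespace (including newlines) so that each
            # record occupies exactly one line of the output file.
            line = " ".join(text.split())
            if line:  # skip records that are empty or whitespace only
                out.write(line + "\n")
                count += 1
    return count
```

The resulting file can then be given to text2sval.pl; its documentation
describes the exact input it expects and the options it accepts.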

>
> 2) Can the contexts be headless, or do I have to tag (<head></head>) all the
> entities (medical names) in the input?

Your contexts can be headless, and so there is no need to include
<head> tags in your contexts.

>
> 3) Are there other constraints on the context files (formatting, tags, or other)?
>

There shouldn't be. The output from text2sval.pl should be acceptable
for input "as is".

> If the experiments succeed, of course, I'll credit and cite the
> SenseClusters project.
>
> PS - my system works on the Italian language.

That's great! We'd be happy to answer further questions as they arise,
and will be curious to know how things work out!

Good luck,
Ted

>
> Thanks for your response,
> Stefano Silvestri,
> NLP researcher at University of Naples "Federico II"
>
> _______________________________________________
> senseclusters-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/senseclusters-users
>



-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse
