[jira] [Commented] (OPENNLP-1226) dd.mm.yyyy Date format in txt file for .bin model training

Simon poortman (Jira) Tue, 19 Nov 2019 17:14:20 -0800


    [ 
https://issues.apache.org/jira/browse/OPENNLP-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977964#comment-16977964
 ]


Simon poortman commented on OPENNLP-1226:
-----------------------------------------

[root@cloudpoc3 ~]# more /etc/aurora/clusters.json
[

{ "auth_mechanism": "UNAUTHENTICATED", "name": "devcluster", 
"scheduler_zk_path": "/aurora/scheduler", "slave_root": "/var/lib/mesos", 
"slave_run_directory": "latest", "zk": "127.0.0.1" }

]
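The job definition further down refers to its cluster by name, so that name has to appear in clusters.json. A minimal sketch of that lookup, assuming the JSON shown above; `find_cluster` is a hypothetical helper written for illustration and is not part of Aurora:

```python
import json

# Contents copied from the clusters.json shown above.
CLUSTERS_JSON = """
[
  {"auth_mechanism": "UNAUTHENTICATED", "name": "devcluster",
   "scheduler_zk_path": "/aurora/scheduler", "slave_root": "/var/lib/mesos",
   "slave_run_directory": "latest", "zk": "127.0.0.1"}
]
"""


def find_cluster(clusters_text, name):
    """Return the cluster entry with the given name, or None if absent.

    Hypothetical helper for illustration only; not an Aurora API.
    """
    for cluster in json.loads(clusters_text):
        if cluster.get("name") == name:
            return cluster
    return None


cluster = find_cluster(CLUSTERS_JSON, "devcluster")
print(cluster["zk"] if cluster else "cluster not found")
```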

Below is the hello_world.aurora file:

pkg_path = '/opt/aurora_test/hello_world.py'

# We use a trick here to make the configuration change with the contents of
# the file, for simplicity. In a normal setting, packages would be versioned,
# and the version number would be changed in the configuration.
import hashlib
with open(pkg_path, 'rb') as f:
  pkg_checksum = hashlib.md5(f.read()).hexdigest()

# copy hello_world.py into the local sandbox
install = Process(
  name = 'fetch_package',
  cmdline = 'cp %s . && echo %s && chmod +x hello_world.py' % (pkg_path, pkg_checksum))

# run the script
# cmdline = 'python -u hello_world.py'
# (the three lines below were highlighted in red in the original comment)
hello_world = Process(
  name = 'hello_world',
  cmdline = 'echo "----gang---hello aurora-------";')

# describe the task
hello_world_task = SequentialTask(
  processes = [hello_world],
  resources = Resources(cpu = 2, ram = 4096*MB, disk=4096*MB))

jobs = [
  Service(cluster = 'devcluster',
          environment = 'devel',
          role = 'www-data',
          name = 'hello_world',
          task = hello_world_task)
]

> dd.mm.yyyy Date format in txt file for .bin model training
> ----------------------------------------------------------
>
>                 Key: OPENNLP-1226
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1226
>             Project: OpenNLP
>          Issue Type: Question
>          Components: Formats
>            Reporter: Olga
>            Priority: Blocker
>              Labels: newbie
>         Attachments: 49B59671-282D-4AEF-BB56-3B9439433F32.jpeg, 
> 73EA1520-1D86-4CF2-84A6-718742ADEE67.jpeg, 
> 88F58024-0D0B-48FB-A382-723D7DADC8AA.jpeg, 
> CE401D05-AF99-4DA2-B1BA-439822AA4275.jpeg
>
>
> My txt file for model training tags dates in the form
> <START:date> dd.mm.yyyy <END>. But when I use the trained .bin file, the
> dates are not extracted as they should be. My tagged txt file has one
> sentence per line. I was wondering whether this date format, with its full
> stops, makes it difficult for the model to learn. In the official OpenNLP
> documentation I can see there is a .bin file for date extraction, but I
> can't find the txt file containing the tags.
> I tried to open this .bin file as text, but I read on Stack Overflow that I
> can't do that.
> https://stackoverflow.com/questions/26140492/how-can-i-view-the-content-of-a-bin-file-in-opennlp
>  
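One thing that may be worth checking in the training file is whether every tag is a separate, whitespace-delimited token and every <START:date> has a matching <END>. The Python sketch below is only an illustration under two assumptions: the text is already whitespace-tokenized, one sentence per line, and dates appear as standalone dd.mm.yyyy tokens. `tag_dates` and `is_well_formed` are hypothetical helpers, not part of OpenNLP.

```python
import re

# Matches dd.mm.yyyy only when it stands alone as a whitespace-delimited token.
DATE_RE = re.compile(r"(?<!\S)(\d{2}\.\d{2}\.\d{4})(?!\S)")
# A start tag such as <START:date>, matched as a whole token.
START_RE = re.compile(r"<START:[^\s<>]+>")


def tag_dates(sentence):
    """Wrap each standalone dd.mm.yyyy token in <START:date> ... <END>."""
    return DATE_RE.sub(r"<START:date> \1 <END>", sentence)


def is_well_formed(line):
    """Check that tags are separate tokens and START/END pairs are balanced."""
    open_span = False
    for token in line.split():
        if START_RE.fullmatch(token):
            if open_span:
                return False  # nested or repeated start tag
            open_span = True
        elif token == "<END>":
            if not open_span:
                return False  # end tag without a start tag
            open_span = False
        elif "<START:" in token or "<END>" in token:
            return False  # tag glued to a neighbouring token
    return not open_span


tagged = tag_dates("The meeting is on 19.11.2019 in Berlin .")
print(tagged)
print(is_well_formed(tagged))
```

A line such as "on <START:date>19.11.2019<END> ." fails this check because the tags are glued to the date.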



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
