[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module

2011-10-26 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530
-- 
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.

-- 
Mailing list: https://launchpad.net/~zorba-coders
Post to : zorba-coders@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zorba-coders
More help   : https://help.launchpad.net/ListHelp


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module

2011-11-10 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module

2011-11-16 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module

2011-11-16 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module

2011-11-18 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module

2011-11-23 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530


[Zorba-coders] [Merge] lp:zorba/data-cleaning-module into lp:~diogo-simoes89/zorba/DC

2012-02-01 Thread Diogo Simões
Diogo Simões has proposed merging lp:zorba/data-cleaning-module into 
lp:~diogo-simoes89/zorba/DC.

Requested reviews:
  Diogo Simões (diogo-simoes89)

For more details, see:
https://code.launchpad.net/~zorba-coders/zorba/data-cleaning-module/+merge/91121

Changes to the tests of the conversion module:
- address-from-phone
- address-from-user
- phone-from-address
- phone-from-user
- user-from-address
- user-from-phone

These changes let the tests accept variations in the results returned by the web services.


[Zorba-coders] [Merge] lp:zorba/data-cleaning-module into lp:~diogo-simoes89/zorba/DC

2012-02-01 Thread Diogo Simões
The proposal to merge lp:zorba/data-cleaning-module into 
lp:~diogo-simoes89/zorba/DC has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~zorba-coders/zorba/data-cleaning-module/+merge/91121


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module

2012-02-01 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/DC into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC/+merge/91124


Re: [Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module

2012-02-01 Thread Diogo Simões

Thanks Chris.

Guess it is done now

> To: mp+91...@code.launchpad.net
> From: chillery+launch...@lambda.nu
> Subject: Re: [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module
> Date: Wed, 1 Feb 2012 17:18:20 +
>
> Diogo, you need to set the commit message for the merge proposal in order for
> the validation queue to run.
> --
> https://code.launchpad.net/~diogo-simoes89/zorba/DC/+merge/91124
> You are the owner of lp:~diogo-simoes89/zorba/DC.
  


Re: [Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module

2012-02-01 Thread Diogo Simões
Review: Approve

Changes to the tests of the conversion module:
- address-from-phone
- address-from-user
- phone-from-address
- phone-from-user
- user-from-address
- user-from-phone

These changes let the tests accept variations in the results returned by the web services.
-- 
https://code.launchpad.net/~diogo-simoes89/zorba/DC/+merge/91124
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.



[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC-conversion-tests into lp:zorba/data-cleaning-module

2012-02-05 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/DC-conversion-tests into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC-conversion-tests/+merge/91599


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module

2012-04-03 Thread Diogo Simões
Diogo Simões has proposed merging 
lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into 
lp:zorba/data-cleaning-module.

Requested reviews:
  Zorba Coders (zorba-coders)

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683

This revision includes a new normalization function: capitalize($string as 
xs:string) as xs:string.
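
As a rough illustration of the capitalization rule, here is a minimal Python sketch (not the module's XQuery code). It assumes tokens are separated by single spaces and that stop words other than the first token stay in lower case. The `STOP_WORDS` set is a hypothetical stand-in for the full-text module's stop-word check, and lower-casing a token before the lookup is also an assumption.

```python
# Hypothetical stop-word list; the XQuery version delegates this check
# to the full-text module instead of using a fixed set.
STOP_WORDS = frozenset({"a", "an", "and", "in", "of", "on", "the", "to"})


def capitalize(string: str) -> str:
    """Capitalize each token, keeping stop words (except the first token) in lower case."""
    tokens = string.split(" ")
    result = []
    for position, token in enumerate(tokens):
        lowered = token.lower()
        if position == 0 or lowered not in STOP_WORDS:
            # Upper-case the first character, lower-case the rest.
            result.append(lowered[:1].upper() + lowered[1:])
        else:
            result.append(lowered)
    return " ".join(result)
```

With this stop-word set, `capitalize("the lord OF the rings")` yields `"The Lord of the Rings"`.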

It also includes the thesaurus-based module, with the check-related ( $s1 as 
xs:string, $s2 as xs:string, $uri as xs:string, $type as xs:string ) and the 
related-terms ( $s1 as xs:string, $uri as xs:string, $type as xs:string ) 
functions.
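
The lookup semantics can be sketched in Python as follows. This is only an illustration under stated assumptions: the thesaurus is modelled as a hypothetical in-memory dictionary rather than a resource identified by a URI, and the relationship names (`"synonym"`, `"broader"`) are made up for the example.

```python
from typing import Dict, Set, Tuple

# Hypothetical in-memory thesaurus: (term, relationship type) -> related terms.
Thesaurus = Dict[Tuple[str, str], Set[str]]

EXAMPLE_THESAURUS: Thesaurus = {
    ("car", "synonym"): {"automobile", "motorcar"},
    ("car", "broader"): {"vehicle"},
}


def related_terms(s1: str, thesaurus: Thesaurus, relation: str) -> Set[str]:
    """Return the terms related to s1 under the given relationship type."""
    return set(thesaurus.get((s1, relation), set()))


def check_related(s1: str, s2: str, thesaurus: Thesaurus, relation: str) -> bool:
    """Return True when s1 is among the terms related to s2."""
    return s1 in related_terms(s2, thesaurus, relation)
```

For example, `check_related("automobile", "car", EXAMPLE_THESAURUS, "synonym")` is true, while the same call with `"vehicle"` as the first argument is false.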
-- 
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683
Your team Zorba Coders is requested to review the proposed merge of 
lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into 
lp:zorba/data-cleaning-module.
=== modified file 'src/com/zorba-xquery/www/modules/data-cleaning/normalization.xq'
--- src/com/zorba-xquery/www/modules/data-cleaning/normalization.xq	2011-11-08 21:16:29 +
+++ src/com/zorba-xquery/www/modules/data-cleaning/normalization.xq	2012-04-03 20:16:21 +
@@ -31,12 +31,34 @@
 module namespace normalization = "http://www.zorba-xquery.com/modules/data-cleaning/normalization";
 
 import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
+import module namespace ft = "http://www.zorba-xquery.com/modules/full-text";
 
 declare namespace ann = "http://www.zorba-xquery.com/annotations";
 declare namespace ver = "http://www.zorba-xquery.com/options/versioning";
 declare option ver:module-version "2.0";
 
 (:~
+: Converts a given string into a capitalized representation.
+:
+: @param $string The string to be capitalized.
+:
+: @return The string resulting from the conversion.
+: @example test/Queries/data-cleaning/normalization/capitalize.xq
+:)
+declare function normalization:capitalize ($string as xs:string) as xs:string{
+  let $ttokens := tokenize ($string, " ")
+  let $cap-tokens :=
+    for $toks in $ttokens[position()>1]
+    let $capitalized-tokens := 
+      if (not(ft:is-stop-word($toks)))
+      then concat(upper-case(substring($toks, 1,1)), substring(lower-case($toks), 2), " ")
+      else concat(lower-case($toks), " ")
+    return $capitalized-tokens
+  let $cap-string := concat(concat(upper-case(substring($ttokens[position()=1], 1,1)), substring(lower-case($ttokens[position()=1]), 2), " "), string-join($cap-tokens))
+  return substring($cap-string, 1, string-length($cap-string)-1)
+};
+
+(:~
  : Converts a given string representation of a date value into a date representation valid according 
  : to the corresponding XML Schema type.
  :

=== added file 'src/com/zorba-xquery/www/modules/data-cleaning/thesaurus-based.xq'
--- src/com/zorba-xquery/www/modules/data-cleaning/thesaurus-based.xq	1970-01-01 00:00:00 +
+++ src/com/zorba-xquery/www/modules/data-cleaning/thesaurus-based.xq	2012-04-03 20:16:21 +
@@ -0,0 +1,74 @@
+(:
+ : Copyright 2006-2009 The FLWOR Foundation.
+ :
+ : Licensed under the Apache License, Version 2.0 (the "License");
+ : you may not use this file except in compliance with the License.
+ : You may obtain a copy of the License at
+ :
+ : http://www.apache.org/licenses/LICENSE-2.0
+ :
+ : Unless required by applicable law or agreed to in writing, software
+ : distributed under the License is distributed on an "AS IS" BASIS,
+ : WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ : See the License for the specific language governing permissions and
+ : limitations under the License.
+ :)
+
+(:~
+ : This library module provides thesaurus functions for checking semantic relations between strings 
+ : and for checking abbreviations.
+
+ : These functions are particularly useful in tasks related to the creation of semantic mappings.
+ : 
+ :
+ : @author Bruno Martins and Diogo Simões
+ :)
+
+module namespace thesaurus = "http://www.zorba-xquery.com/modules/data-cleaning/thesaurus";
+
+import module namespace ft = "http://www.zorba-xquery.com/modules/full-text";
+
+(:~
+ : Checks if two strings have a relationship defined in a given thesaurus.
+ : The implementation of this function depends on the full-text module.
+ :
+ :
+ : @param $s1 The first string.
+ : @param $s2 The second string.
+ : @param $uri The uri of the thesaurus to be considered.
+ : @param $type An identifier for the type of relationship.
+ :
+ : @return true if the first string has the provided relationship with the second string defined in the thesaurus and false otherwise.
+ : @example test/Queries/data-cleaning/thesaurus-based/check-related.xq 
+ : 
+ :)
+declare function thesaurus:check-related ( $s1 as xs:string, $s2 as xs:string, $uri as xs:string, $type as xs:string ) as xs:boolean {
+  let $relation := ft:thesaurus-lookup( $uri,
+  $s2,
+  xs:language("en"),
+  $type )
+  return $relation = $s1

[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module

2012-04-17 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into 
lp:zorba/data-cleaning-module has been updated.

Status: Approved => Needs review

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module

2012-04-17 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Approved

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module

2012-04-27 Thread Diogo Simões
The proposal to merge lp:~diogo-simoes89/zorba/DC-documentation into 
lp:zorba/data-cleaning-module has been updated.

Status: Needs review => Rejected

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC-documentation/+merge/103728


[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module

2012-04-27 Thread Diogo Simões
Diogo Simões has proposed merging lp:~diogo-simoes89/zorba/DC-documentation 
into lp:zorba/data-cleaning-module.

Requested reviews:
  Zorba Coders (zorba-coders)

For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC-documentation/+merge/103902

Added return types to the function signatures in the conversion,
consolidation and set-similarity modules.
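
For readers unfamiliar with the consolidation functions, here is a minimal Python sketch of what most-frequent and least-frequent compute, written with explicit return types. It is an illustration only: the function names and the use of `collections.Counter` are assumptions for the example, it handles only hashable values, and ties are broken by first occurrence, which may differ from the XQuery ordering.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def most_frequent(items: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value that occurs most often, or None for an empty input."""
    counts = Counter(items)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def least_frequent(items: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value that occurs least often, or None for an empty input."""
    counts = Counter(items)
    if not counts:
        return None
    return min(counts, key=counts.get)
```

The `Optional` return type makes the empty-input case explicit, which is the kind of detail a declared return type has to account for.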
-- 
https://code.launchpad.net/~diogo-simoes89/zorba/DC-documentation/+merge/103902
Your team Zorba Coders is requested to review the proposed merge of 
lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module.
=== modified file 'src/com/zorba-xquery/www/modules/data-cleaning/consolidation.xq'
--- src/com/zorba-xquery/www/modules/data-cleaning/consolidation.xq	2011-08-01 11:26:53 +
+++ src/com/zorba-xquery/www/modules/data-cleaning/consolidation.xq	2012-04-27 15:23:42 +
@@ -50,7 +50,7 @@
  : @return The most frequent node in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-frequent.xq
  :)
-declare function con:most-frequent ( $s ) {
+declare function con:most-frequent ( $s ) as item(){
  (for $str in set:distinct($s) order by count($s[deep-equal(.,$str)]) descending return $str)[1]
 };
 
@@ -67,7 +67,7 @@
  : @return The least frequent node in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/leastfrequent_1.xq
  :)
-declare function con:least-frequent ( $s ) {
+declare function con:least-frequent ( $s ) as item(){
  let $aux := for $str in set:distinct($s) order by count($s[deep-equal(.,$str)]) return $str
  return if (count($aux) = 0) then () else ($aux[1])
 };
@@ -242,7 +242,7 @@
  : @return The node having the largest number of descending elements in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-elements.xq
  :)
-declare function con:most-elements ( $s ) {
+declare function con:most-elements ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::element()) descending return $str)[1]
 };
 
@@ -260,7 +260,7 @@
  : @return The node having the largest number of descending attributes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-attributes.xq
  :)
-declare function con:most-attributes ( $s ) {
+declare function con:most-attributes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::*/attribute()) descending return $str)[1]
 };
 
@@ -278,7 +278,7 @@
  : @return The node having the largest number of descending nodes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-nodes.xq
  :)
-declare function con:most-nodes ( $s ) {
+declare function con:most-nodes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::node()) descending return $str)[1]
 };
 
@@ -296,7 +296,7 @@
  : @return The node having the smallest number of descending elements in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/least-elements.xq
  :)
-declare function con:least-elements ( $s ) {
+declare function con:least-elements ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::element()) return $str)[1]
 };
 
@@ -314,7 +314,7 @@
  : @return The node having the smallest number of descending attributes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/least-attributes.xq
  :)
-declare function con:least-attributes ( $s ) {
+declare function con:least-attributes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::*/attribute()) return $str)[1]
 };
 
@@ -332,7 +332,7 @@
  : @return The node having the smallest number of descending nodes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/least-nodes.xq
  :)
-declare function con:least-nodes ( $s ) {
+declare function con:least-nodes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::node()) return $str)[1]
 };
 
@@ -350,7 +350,7 @@
  : @return The node having the largest number of distinct descending elements in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-distinct-elements.xq
  :)
-declare function con:most-distinct-elements ( $s ) {
+declare function con:most-distinct-elements ( $s ) as element(){
  (for $str in set:distinct($s) order by count(set:distinct($str/descendant-or-self::element())) descending return $str)[1]
 };
 
@@ -368,7 +368,7 @@
  : @return The node having the largest number of distinct descending attributes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-distinct-attributes.xq
  :)
-declare function con:most-distinct-attributes ( $s ) {
+declare function con:most-distinct-attributes ( $s ) as element(){
  (for $str in set:distinct($s) order by count(set:distinct($str/descendant-or-self::*/attribute())) descending return $str)[1