Re: [gobolinux-devel] Code Documentation Project: String patch

Kevin Quick Sun, 13 Apr 2008 11:25:53 -0700

Daniele,

A couple of English corrections (it's a crazy language!), some typo fixes,
and a few Python recommendations.

One in particular: since you're already willing to use Python here, just use
Python for Get_Token too. That resolves the ambiguity over negative indexing
and bounds, and it is much simpler as well.  :)

Comments interspersed below.

-KQ

On 12 Apr 2008, at 9:23 AM, Daniele Maccari wrote:
> Here's a patch for the String script.
> I noticed what I think could be considered a little bug in the  
> Get_Token function, at the very last line. Look at the comments and  
> please let me know if I got it right. Basically we want negative  
> indexes to be interpreted a la python, but we must take care of out  
> of bounds errors.
>
> Bye
>
> P.S.: please let me know if you feel more comfortable with patches  
> involving more files at once, rather than single ones.
> --- trunk/Scripts/Functions/String    2008-04-11 16:47:47.000000000 +0200
> +++ trunk.new/Scripts/Functions/String        2008-04-12 18:11:04.000000000 +0200
> @@ -1,73 +1,215 @@
>  #!/bin/bash (source)
>
> -# //////////////////////////////////////////////////
> -# STRING OPERATIONS
> -# //////////////////////////////////////////////////
> +##################################################################### 
> #########
> +# File:
> +#       Strings
> +# Description:
> +#       This script contains a bunch of utility functions to  
> conveniently
> +#       handle strings and operations on them.
> +# Usage:
> +#       See each function's header for a detailed description of  
> its usage.
> +##################################################################### 
> ########
> +
> +##################################################################### 
> ########
> +# Name:
> +#       Is_Empty
> +# Description:
> +#       Simply checks whether the passed string is zero-lengthed.

zero-length.

> +# Arguments:
> +#       string -- string : The string to be checked.
> +# Output:
> +#       none.
> +# Return:
> +#       true  : the string has length 0.
> +#       false : the string has length different from 0.
> +# Usage:
> +#       Is_Empty <string>
> +# Example:
> +#       Is_Empty "foobar"
> +#       Will return false.
> +# Example:
> +#       Is_Empty ""
> +#       Will return true.
> +##################################################################### 
> #########
>
>  function Is_Empty() { [ -z "$*" ] ;}
>
> -# example: Starts_With "oper" "operations"
> +##################################################################### 
> #########
> +# Name:
> +#       Starts_With
> +# Description:
> +#       Checks whether the passed string starts with the passed  
> substring.
> +# Arguments:
> +#       start  -- string : The substring the string must start with.
> +#       string -- string : The string to be checked.
> +# Output:
> +#       true  : the string starts with <start>.
> +#       false : the string doesn't start with <start>.
> +# Return:
> +#       none.
> +# Usage:
> +#       Starts_With <start> <string>
> +# Example:
> +#       Starts_With "oper" "operations"
> +#       Will return true.
> +##################################################################### 
> #########
> +
>  function Starts_With() {
> -   l=${#1}
> +   # The first l characters in <string> must match <start>.
> +   l=${#1}
>     [[ "${2:0:l}" = "$1" ]]
>  }
>
> +##################################################################### 
> #########
> +# Name:
> +#       Ends_With
> +# Description:
> +#       Checks whether the passed string ends with the passed  
> substring.
> +# Arguments:
> +#       end    -- string : The substring the string must end with.
> +#       string -- string : The string to be checked.
> +# Output:
> +#       none.
> +# Return:
> +#       true  : the string ends with <end>.
> +#       false : the string doesn't end with <end>.
> +# Usage:
> +#       Ends_With <end> <string>
> +# Example:
> +#       Ends_With "tions" "operations"
> +#       Will return true.
> +##################################################################### 
> #########
> +
>  #detsch, 23/08/2004
> -# example: Ends_With "tions" "operations"
>  function Ends_With() {
> -   l2=${#2}
> +   # The last l3-l2 characters in <string> must match <end>.
> +   l2=${#2}
>     l3=$[ ${#2} - ${#1} ]
>     [[ "${2:l3:l2}" = "$1" ]]
>  }
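
For comparison, here is a minimal Python 3 sketch of the same slice-and-compare
logic (Python 3 and the snake_case names are assumptions for illustration, not
part of the patch). Both shell functions report their result through the exit
status and print nothing, so in the Starts_With header the true/false lines
appear to belong under "Return" rather than "Output", as they do for Ends_With.

```python
def starts_with(start, string):
    # The first len(start) characters of <string> must equal <start>.
    return string[:len(start)] == start


def ends_with(end, string):
    # <end> must fit inside <string>, and the tail from that offset must match.
    offset = len(string) - len(end)
    return offset >= 0 and string[offset:] == end
```

In everyday Python, str.startswith and str.endswith do the same job; the
slices are spelled out here only to mirror the bash substring expansions.
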
>
> +##################################################################### 
> #########
> +# Name:
> +#       Has_Uppercase
> +# Description:
> +#       Checks whether the passed string contains a capitalized  
> character.
> +# Arguments:
> +#       string -- string : The string to be checked.
> +# Output:
> +#       none.
> +# Return:
> +#       true  : the string contains at least one capitalized  
> character.
> +#       false : the strong doesn't contain any capitalized character.

strong -> string

> +# Usage:
> +#       Has_Uppercase <string>
> +# Notes:
> +#       Please note that the capitalized character might appear  
> anywhere
> +#       inside <string>.
> +# Example:
> +#       Has_Uppercase "upperCase"
> +#       Will return true.
> +##################################################################### 
> #########
> +
>  function Has_Uppercase() {
>     echo "$1" | grep "[[:upper:]]" &> /dev/null
>  }
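
A rough Python 3 equivalent of the grep test, for illustration only (the
function name is hypothetical; str.isupper follows Unicode rules, whereas
[[:upper:]] depends on the locale, so the two may disagree on non-ASCII input):

```python
def has_uppercase(string):
    # True as soon as any character, anywhere in the string, is uppercase.
    return any(ch.isupper() for ch in string)
```
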
>
> +##################################################################### 
> #########
> +# Name:
> +#       Capitalize
> +# Description:
> +#       Converts a string to the correspondent capitalized sister.
> +# Arguments:
> +#       string -- string : The string to be capitalized.
> +# Output:
> +#       The capitalized string.
> +# Return:
> +#       none.
> +# Notes:
> +#       Note that the passed string will be split if it contains '-' or '_'
> +#       and the single parts will be printed joined and capitalized.
> +# Usage:
> +#       Capitalize <string>
> +# Example:
> +#       Capitalize "upper_case"
> +#       Will return "UpperCase".
> +##################################################################### 
> #########
> +
>  function Capitalize() {
> +   # Using python for convenience.
>     python -c "
>
>  import sys,string
>
>  word = sys.argv[1]
>
> +# Split a word when encountering a '-' or '_' character.
>  def breakWord(word):
>      for ch in ['-', '_']:
>          pos = string.find(word, ch)
>          if pos != -1:
> +            # We've found one of the needed chars.
> +            # Return a list containing the string part before ch,
> +            # then ch, and the result of the subsequent split of
> +            # the remaining part of the string.
>              return [ word[:pos], ch ] + breakWord(word[pos+1:])
> +    # The string couldn't be split.
>      return [ word ]
>
>  parts = breakWord(word)
> +# Print every part capitalized, one after the other.
>  for part in parts:
>      sys.stdout.write(string.capitalize(part))
>
>  " "$1"
>  }

Since you're willing to use python, the above could be:

python -c "import string,sys; print '-'.join(map(lambda w:  
string.capwords(w, '_'), sys.argv[1].split('-')))" "$1"
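
A slightly longer Python 3 sketch of what Capitalize appears to be meant to
do, assuming the separators should be preserved (the function name and the use
of re.split are illustrative choices, not from the patch). As far as I can
tell, both the quoted code and the one-liner above keep the '-' and '_'
characters, so "upper_case" comes out as "Upper_Case", not the "UpperCase"
given in the header example.

```python
import re


def capitalize(word):
    # Split on '-' or '_'; the capturing group keeps the separators
    # in the result list so they can be written back out unchanged.
    parts = re.split(r"([-_])", word)
    # str.capitalize uppercases the first character and lowercases the rest;
    # a lone '-' or '_' is left as it is.
    return "".join(part.capitalize() for part in parts)
```

This treats both separators uniformly, so it may differ from the quoted
breakWord on strings that mix '-' and '_', since breakWord looks for '-' first.
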

>
> -# Split_String
> -# splits a string into an array and places the result in the  
> varable specified
> -# if <token> is ommitted whitespace is used
> +##################################################################### 
> #########
> +# Name:
> +#       Split_String
> +# Description:
> +#       Splits a string into an array and places the result in the  
> variable
> +#       specified. If the token is ommitted whitespace is used.

omitted

> +# Arguments:
> +#       array  -- string : The name of the array.
> +#       string -- string : The string to be capitalized.

capitalized -> split

> +#       token  -- char   : The token used to split the string.
> +# Output:
> +#       The splitted string.

The split string.

> +# Return:
> +#       none.
> +# Notes:
> +#       Note that this function should be evaluated rather than  
> being used
> +#       direclty.

directly.

>  # Usage:
> -# eval $(Split_String <name of variable> <string> [<token>])
> +#       eval $(Split_String <array> <string> [<token>])
> +# Example:
> +#       eval $(Split_String split "a list of strings")
> +#       Will return a bash array of the form split=("a" "list"  
> "of" "strings").
> +##################################################################### 
> #########
> +
>  function Split_String {
> -   string_name="$1"
> +   array_name="$1"
>     string="$2"
>     token="$3"
>     if [ -z "$token" ]
>     then
>        token=" "
> +      # Substitute multiple blanks with a single one.
>        string=$(echo "${string}" | sed -r 's/\s+/ /g')
>     fi
>
> -   echo -n "${string_name}=("
> +   echo -n "${array_name}=("
>     while index=$(expr index "${string}" "${token}")
>           [ $index -gt 0 ]
>     do
> +      # While there are tokens left, echo the part of <string>  
> starting
> +      # from 0 to the first token, then replace <string> with its  
> part
> +      # following the token, included.
>        echo -n "\"${string:0:((index-1))}\" "
>        string="${string:index}"
>     done
> -   echo -n "\"${string}\""
> +   echo -n "\"${string}\"" # the part of <string> following the last token.
>     echo ")"
>  }

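And again, with Python:

function Split_String {
   python -c "import sys
# An empty <token> means 'split on any run of whitespace'.
parts = sys.argv[2].split(sys.argv[3] or None)
print sys.argv[1] + '=(' + ' '.join('\"%s\"' % p for p in parts) + ')'
" "$1" "$2" "$3"
}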
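
Since the output is meant to be eval'ed, it is probably safer to quote each
element for the shell. A minimal Python 3 sketch, assuming shlex.quote is
acceptable for that (the function name split_string is hypothetical and not
part of the patch):

```python
import shlex


def split_string(array_name, string, token=None):
    """Return a bash array assignment such as: name=(a b c)."""
    # With no token, str.split() splits on runs of whitespace, similar to
    # the blank-collapsing sed step in the bash version.
    parts = string.split(token or None)
    # shlex.quote leaves plain words alone and single-quotes anything else.
    return "%s=(%s)" % (array_name, " ".join(shlex.quote(p) for p in parts))


print(split_string("split", "a list of strings"))
# prints: split=(a list of strings)
```

Elements containing spaces or shell metacharacters come back single-quoted,
for example split_string("parts", "a:b c:d", ":") gives parts=(a 'b c' d).
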

>
> @@ -82,7 +224,7 @@
>  #}
>
>  #detsch, 23/08/2004
> -# shell/sed/awk/cut implementation is welcome, but don't forget
> +# shell/sed/awk/cut implementation is welcome, but don't forget
>  # negative values at "$3"
>  #function Get_Token() {
>  #   python -c "
> @@ -91,39 +233,130 @@
>  #" "$1" "$2" "$3"
>  #}
>
> -# splits $1 by $2, returns entry $3 (may be negative)
> +##################################################################### 
> #########
> +# Name:
> +#       Get_Token
> +# Description:
> +#       Search for the specified <position>-th part in a string  
> splitted by

splitted -> split

> +#       the specified token, and prints it when found.
> +# Arguments:
> +#       string   -- string : The string to be splitted.

splitted -> split

> +#       token    -- char   : The token used to split the string.
> +#       position -- number : The position of the part we want to get.
> +# Output:
> +#       The <position>-th part, if present. Nothing otherwise.
> +# Return:
> +#       none.
> +# Notes:
> +#       Note that <position> can also assume negative values, in  
> which case
> +#       it is interpreted starting from the last part, with -1  
> indicating
> +#       it.

it -> "the last token in the string."

> +# Usage:
> +#       Get_Token <string> <token> <position>
> +# Example:
> +#       Get_Token "a list of strings" " " 2
> +#       Will return "list".
> +# Example:
> +#       Get_Token "a list of strings" " " -2
> +#       Will return "of".
> +##################################################################### 
> #########
> +
>  function Get_Token() {
>     local i=0 tokens=() len=${#2}
>     local start rest=$1
> +   # Initially, the remainder is <string>.
>     while [ "$rest" ]
>     do
> +       # If we have a remainder, split it by <token>
>         start=${rest%%$2*}
>         if [ $i -eq $3 ]
>         then
> +           # We've found the part we were looking for.
>             echo -n $start
>             return 0
>         fi
> +       # Else put the left part into <tokens>, and take the  
> remainder.
>         tokens[$i]=$start
>         rest=${rest:${#start}}
> +
>         if [ "$rest" ]
>         then
> +           # If we have a remainder, remove <token> from it.
> +           # Then start all over again.
>             rest=${rest:$len}
>             let i++
>         fi
>     done
> -   [ $3 -lt 0 ] && echo -n ${tokens[$(( $i+$3+1 ))]}
> +   # A negative <position> means the last but <position> part.
> +   # So with an array of 5 elements -3 would mean 3rd.
> +   # Of course it can't go outside bounds.
> +   # Note that we're able to do this control only when splitting is
> +   # finished, since otherwise we cannot know parts number.
> +   [ $3 -lt 0 -a $3 -lt ${#tokens} ] && echo -n ${tokens[$(( $i+$3+1 ))]}
>  }

Another one in Python:

python -c "import sys; print sys.argv[1].split(sys.argv[2])[int(sys.argv[3])]" \
   "$1" "$2" "$3"
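
One caveat with the one-liner above: an out-of-range position raises
IndexError instead of printing nothing, which is what the header promises.
Below is a minimal Python 3 sketch with an explicit bounds check (the function
name is hypothetical). Positions are zero-based here, as in the bash loop, so
position 2 of "a list of strings" is "of" rather than the "list" shown in the
header example. On the bash side, note that ${#tokens} is the length of the
first element; the element count is ${#tokens[@]}.

```python
def get_token(string, token, position):
    parts = string.split(token)
    # Valid indexes run from -len(parts) (first part) to len(parts) - 1
    # (last part); -1 is the last part, as with Python lists.
    if -len(parts) <= position < len(parts):
        return parts[position]
    # Out of bounds: return nothing rather than raising IndexError.
    return ""
```
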

>
> -# writes each string passed to it, but with the last character  
> removed.
> +##################################################################### 
> #########
> +# Name:
> +#       Chop
> +# Description:
> +#       Returns the string passed to it, but with the last  
> character removed.
> +# Arguments:
> +#       string   -- string : The string to be chopped.
> +# Output:
> +#       The chopped string.
> +# Return:
> +#       none.
> +# Usage:
> +#       Chop <string>
> +# Example:
> +#       Chop "string"
> +#       Will return "strin".
> +##################################################################### 
> #########
> +
>  function Chop() {
>     echo -n "$(echo "$*" | sed -re "s/(.*)./\1/g")"
>  }
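
In Python the same operation is a slice. A minimal Python 3 sketch, assuming
single-line input (the function name is hypothetical):

```python
def chop(string):
    # Drop the last character; an empty string stays empty.
    return string[:-1]
```
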
>
> -# Returns the arguments passed to it converted to lower case
> +##################################################################### 
> #########
> +# Name:
> +#       Downcase
> +# Description:
> +#       Returns the string passed to it, lower case.
> +# Arguments:
> +#       string   -- string : The string to be put lower case.
> +# Output:
> +#       The converted string.
> +# Return:
> +#       none.
> +# Usage:
> +#       Downcase <string>
> +# Example:
> +#       Downcase "STRING"
> +#       Will return "string".
> +##################################################################### 
> #########
> +
>  function Downcase() {
>     echo "$*" | tr '[:upper:]' '[:lower:]'
>  }
>
> +##################################################################### 
> #########
> +# Name:
> +#       Uppercase
> +# Description:
> +#       Returns the string passed to it, upper case.
> +# Arguments:
> +#       string   -- string : The string to be put upper case.
> +# Output:
> +#       The converted string.
> +# Return:
> +#       none.
> +# Usage:
> +#       Uppercase <string>
> +# Example:
> +#       Uppercase "string"
> +#       Will return "STRING".
> +##################################################################### 
> #########
> +
>  function Uppercase() {
>     echo "$*" | tr '[:lower:]' '[:upper:]'
>  }
>