Re: [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-18 Thread Ulderico Santarelli

Of course. Here it is.



iris.xlsx
Description: MS-Excel 2007 spreadsheet
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn

Re: [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-18 Thread Jaime Lopez

Hi,

Same error. Maybe it is related to the dataset I got from GitHub
(iris.xlsx). Could you share yours?

[image: image.png]

JL



-- 

*Jaime Lopez Carvajal*

Re: [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-18 Thread Ulderico Santarelli

In addition, *the distance I'm using is not a dogma*. It is meant to avoid
the "black hole syndrome" that would emerge with a plain Newtonian
distance law when two points happen to be too near each other. When the
distance is 0, exp(-|w-x|) would be 1, so it is set to 0. I also tried
exp(-|w-x|^2), but the changes are not significant.
Ulderico.
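
A minimal sketch of the two kernels mentioned in this message, assuming the difference w - x is taken coordinate by coordinate as in the script posted in this thread. The function name `signed_force` and the `squared` flag are made up for illustration.

```python
import numpy as np

def signed_force(delta, squared=False):
    """Signed, bounded force for a coordinate difference delta = w - x."""
    delta = np.asarray(delta, dtype=float)
    dist = np.abs(delta)
    # exp(-|d|) by default, exp(-|d|^2) as the alternative kernel.
    magnitude = np.exp(-dist ** 2) if squared else np.exp(-dist)
    # sign(0) == 0 sets the force to 0 at zero distance,
    # where the exponential alone would give 1.
    return np.sign(delta) * magnitude

d = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(signed_force(d))
print(signed_force(d, squared=True))
```

With either kernel the magnitude never exceeds 1, so two nearly coincident points cannot produce an unbounded contribution.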


Re: [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-18 Thread Ulderico Santarelli

*I think it is better to send you the script in its entirety. I ran it just
now and it works.*
*As for work, it is:*
work
array([[ 5.63011247],
       [-2.31453939],
       [22.23122848],
       [15.37678101]])
np.shape(work)
(4, 1)

*My best regards.*
*Ulderico.*
_
import numpy as np
import pandas as pd

# Raw strings keep the backslashes in Windows paths literal.
dataraw = pd.read_excel(r"C:\Pyth\iris.xlsx")
# Standardize data --- dataraw is a DataFrame.
# Keep just the quantitative variables (columns 1 to 4 of the DataFrame).
datar = dataraw.iloc[:, 1:5]
means = datar.mean(axis=0)
stdev = datar.std(axis=0)
data = (datar - means) / stdev

# CENTRALITY INDEX
# Cross join: every point is paired with all 150 points (150 * 150 rows).
scalar = pd.merge(data, data, how='cross')
point1 = scalar.loc[:, 'sepal length _x':'petal width _x']
point2 = scalar.loc[:, 'sepal length _y':'petal width _y']
apoint1 = point1.to_numpy(dtype=float)
apoint2 = point2.to_numpy(dtype=float)
delta = apoint1 - apoint2
force = 0
if delta.any():
    force = np.exp(-abs(delta))
# sign(0) is 0, so a zero difference contributes no force.
sig = np.sign(delta)
sforce = sig * force
dsforce = pd.DataFrame(sforce)
#dsforce.to_excel(r'C:\Pyth\dsforce.xlsx')
arr = np.ones((150, 1))
sforcet = sforce.T
sum_force = np.zeros((1, 4))   # do not use empty arrays
start = 0
end = 150
for i in range(150):
    # Columns start:end hold the pairs of point i with all 150 points.
    s_forcet = sforcet[:, start:end]
    work = np.matmul(s_forcet, arr)   # shape (4, 1): one row per variable
    sum_force = np.concatenate((sum_force, work.reshape(1, 4)), axis=0)
    start = end
    end += 150
sumforce = sum_force[1:, :]   # drop the initial row of zeros
dsumforce = pd.DataFrame(sumforce)
dsumforce.to_excel(r'C:\Pyth\sumforce_sqc.xlsx')
sum_force_square = sumforce**2
ssT = np.ones((4, 1))
# Euclidean norm of the summed force on each point.
T_w_ = np.sqrt(np.matmul(sum_force_square, ssT))
dT_w_ = pd.DataFrame(T_w_)
dT_w_.to_excel(r'C:\Pyth\T_w_.xlsx')
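
For readers who want a more compact form, here is a sketch of the same computation using NumPy broadcasting instead of the cross merge and the explicit loop. It is an illustration, not the author's code: the function name `net_force_norms` is hypothetical, and random data stands in for the iris measurements so that the snippet runs without the spreadsheet.

```python
import numpy as np

def net_force_norms(x):
    """Norm of the summed signed force sign(d) * exp(-|d|) acting on each row of x."""
    x = np.asarray(x, dtype=float)
    # delta[i, j, k] = x[i, k] - x[j, k] for every pair of points (i, j).
    delta = x[:, None, :] - x[None, :, :]
    # sign(0) == 0, so identical coordinates contribute nothing.
    sforce = np.sign(delta) * np.exp(-np.abs(delta))
    sumforce = sforce.sum(axis=1)          # shape (n_points, n_features)
    return np.sqrt((sumforce ** 2).sum(axis=1))

rng = np.random.default_rng(0)             # synthetic stand-in for the iris data
sample = rng.normal(size=(150, 4))
# ddof=1 matches the default of pandas' DataFrame.std used in the script.
sample = (sample - sample.mean(axis=0)) / sample.std(axis=0, ddof=1)
t_w = net_force_norms(sample)
print(t_w.shape)   # (150,)
```

Because the number of points and variables is read from the array itself, nothing here depends on the sample having exactly 150 rows and 4 columns.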


Re: [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-17 Thread Ulderico Santarelli

I'm going to have a look at this. Thank you for your comment.



Re: [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-17 Thread Jaime Lopez

Hi there,

I got interested in your project, but I hit this error right at the start
(see attached image).
The work array cannot be reshaped to (1, 4) because it has shape (2, 1). Any
suggestions?
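
A note on the shapes involved, as a sketch only: in the script, `work` has one row per selected data column, so a (2, 1) result means that only two columns reached the matrix product, whatever the reason in the spreadsheet at hand. The small DataFrame below is a made-up stand-in (its columns and values are assumptions), and reading the sizes from the data instead of hardcoding 150 and 4 is a suggestion, not part of the original script.

```python
import numpy as np
import pandas as pd

# Hypothetical spreadsheet content: an id column and only two numeric columns.
dataraw = pd.DataFrame({
    "id": [1, 2, 3],
    "sepal length": [5.1, 4.9, 4.7],
    "petal width": [0.2, 0.2, 0.3],
})
datar = dataraw.iloc[:, 1:5].select_dtypes(include="number")
n_points, n_features = datar.shape           # 3 points, 2 features here

arr = np.ones((n_points, 1))
block = np.ones((n_features, n_points))      # plays the role of s_forcet
work = np.matmul(block, arr)
print(work.shape)                            # (2, 1)
row = work.reshape(1, n_features)            # works; reshape(1, 4) would raise ValueError
```

Printing `datar.columns` and `datar.shape` right after the `iloc` selection shows which columns were actually picked up.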

JL

[image: image.png]



-- 

*Jaime Lopez Carvajal*

[scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

2023-09-14 Thread Ulderico Santarelli

  *I am an old guy who started programming in the 1970s* with ASSEMBLER 360,
then FORTRAN, PL1, APL, IBM APPLICATION SYSTEM and, last, the marvelous SAS.
Having heard about the powerful, flexible, functionally complete "PYTHON
UNIVERSE", encompassing an advanced object-oriented language and a very wide
family of packages, I decided to run an exercise on a problem I've been
tackling since my youth (have a look at the Bibliography). I completed it in
a few days, and I'm attaching my solution to the problem of finding the
points in a sample that are "central" in their surrounding topological
neighborhood. They are eligible as centroids for a cluster analysis after
the aggregation of points that are "too near". The solution is based on the
search for potential wells in a suitable potential field, similar to the one
all of us studied in high school. Therefore, points that are too near one
another may lie in the same potential well.
No more words, have a look at the attachment.
My coding is that of a beginner. I'm sure anybody could find more
efficient coding. As a comment: I started studying Python around May 15th,
2023.
My best regards.
Ulderico Santarelli.
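
To make the idea of aggregating "too near" points concrete, here is a rough sketch under stated assumptions. It assumes that a lower index value marks a more central point and that candidates closer than a threshold are merged by keeping only the most central one. The function `pick_candidates`, the parameters `n_keep` and `min_sep`, and the index values are all invented for illustration; the attached document may define these steps differently.

```python
import numpy as np

def pick_candidates(points, index, n_keep=3, min_sep=1.0):
    """Greedy selection: lowest index first, skipping points within min_sep of a kept one."""
    points = np.asarray(points, dtype=float)
    kept = []
    for i in np.argsort(index):
        # Keep point i only if it is far enough from every point kept so far.
        if all(np.linalg.norm(points[i] - points[j]) >= min_sep for j in kept):
            kept.append(int(i))
        if len(kept) == n_keep:
            break
    return kept

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 5.0], [9.0, 0.0]])
idx = np.array([0.2, 0.1, 0.3, 0.5, 0.4])   # made-up values, lower = more central
print(pick_candidates(pts, idx))             # [1, 2, 4]
```

Points 0 and 3 are dropped because each lies within `min_sep` of a more central neighbor, which is one simple way to treat them as belonging to the same well.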


SAMPLE POINTS CENTRALITY INDEX.docx
Description: MS-Word 2007 document