Re: (Promotion) The strongest promotion program!! No more promotion worries.

2002-02-27 Thread Jim Snow


O This mail is an [advertisement] sent in accordance with Article 50 of the
Act on Promotion of Information and Communications Network Utilization and
Information Protection, Etc.
  [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
  O This mail is an [advertisement] sent in accordance with Article 50 of the
  Act on Promotion of Information and Communications Network Utilization and
  Information Protection, Etc.
  O Your e-mail address was collected on the Internet; we hold no personal
  information other than the address.
  If you do not wish to receive this mail, please unsubscribe below. We
  sincerely apologize to anyone who does not want this information.

Have you been worried about promotion? Stop worrying now.
Everything about promotion, and all the know-how, is right here.
Ask us anything.  mailto:[EMAIL PROTECTED]

As we are converting to a promotion-agency business, we are selling off, at
a bargain price, the promotion programs we have collected over the past 3
years.

*** Promotion starter package ***
  - 2 e-mail address extractors
  - 1 e-mail editor
  - 2 e-mail senders (1 full version, 1 demo)
  - e-mail list of 500,000 addresses
  - 1 bulletin-board registration tool
  - bulletin-board DB of 2,000 boards
  All of the above for 100,000 won.

*** Promotion intermediate package ***
  - 3 e-mail address extractors
  - 1 e-mail editor
  - 3 e-mail senders (2 full versions, 1 demo)
  - e-mail list of 1,000,000 addresses
  - 1 bulletin-board registration tool
  - bulletin-board DB of 5,000 boards
  All of the above for 200,000 won.

*** Promotion advanced package (1) ***
  We install an e-mail extractor and sender directly on your personal
  homepage. Features: e-mail extraction, duplicate-address removal,
  automatic opt-out handling, e-mail sending, a temporary hold list for
  opt-outs, and conversion of temporary refusals into permanent opt-outs.
  Installation requires a MySQL account on your homepage; if you have no
  paid hosting, 200 MB with one year of hosting is available at an
  additional hosting fee.
  We do the installation for 200,000 won.

*** Promotion advanced package (2) ***
  We lease a server loaded with a list of 10,000,000 e-mail addresses to
  only a few customers (term: 1 year, price: 1,000,000 won), and we hand
  over the promotion programs and all of our know-how.

*** E-mail advertising agency service ***
  Drawing on our promotion know-how, we have spent 2 years completing our
  sending facilities and have assembled 60,000,000 e-mail addresses; we
  will run your e-mail promotion for you.


Re: detecting outliers in NON normal data ?

2002-02-27 Thread DELOMBA

What about the Hat Matrix? Mahalanobis distance?
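A minimal sketch of such a Mahalanobis-distance screen (numpy/scipy assumed;
note that the chi-square cutoff presumes rough multivariate normality, so with
clearly non-normal data a robust covariance estimate would be a better start):

    import numpy as np
    from scipy.stats import chi2

    def mahalanobis_outliers(X, alpha=0.025):
        """Flag rows whose squared Mahalanobis distance from the sample mean
        exceeds the chi-square(1 - alpha) quantile with p = #columns df."""
        X = np.asarray(X, dtype=float)
        diff = X - X.mean(axis=0)
        inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
        d2 = np.sum(diff @ inv_cov * diff, axis=1)   # squared distances
        return d2, d2 > chi2.ppf(1 - alpha, df=X.shape[1])

    # Example: 100 well-behaved bivariate points plus one gross outlier
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(size=(100, 2)), [[8.0, 8.0]]])
    d2, flagged = mahalanobis_outliers(X)
    print(np.where(flagged)[0])   # index 100 (the appended point) is flagged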

Yves


Voltolini [EMAIL PROTECTED] wrote in message
news:00f301c1be68$13413000$fde9e3c8@oemcomputer...
 Hi,

 I would like to know if methods for detecting outliers
 using interquartile ranges are indicated for data with
 a NON-normal distribution.

 The software Statistica presents this method:
 data point value > UBV + o.c.*(UBV - LBV)
 data point value < LBV - o.c.*(UBV - LBV)

 where UBV is the 75th percentile, LBV is the 25th percentile, and o.c. is
 the outlier coefficient.
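 In code the rule is nearly a one-liner; a quick sketch in Python (with o.c.
 as a parameter; 1.5 is only a common default, not something taken from
 Statistica):

    import numpy as np

    def iqr_outliers(x, oc=1.5):
        """Flag values outside [LBV - oc*(UBV-LBV), UBV + oc*(UBV-LBV)],
        where LBV and UBV are the 25th and 75th percentiles."""
        x = np.asarray(x, dtype=float)
        lbv, ubv = np.percentile(x, [25, 75])
        spread = oc * (ubv - lbv)
        return (x > ubv + spread) | (x < lbv - spread)

    x = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 40])
    print(iqr_outliers(x))   # only the value 40 is flagged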

 In the biological world many data are not normally distributed, and tests
 like Rosner's, Dixon's and Grubbs' (if I am right!) are valid only for
 normally distributed data.


 Can anyone help me?


 Thanks..



 _
 Prof. J. C. Voltolini
 Grupo de Estudos em Ecologia de Mamiferos - ECOMAM
 Universidade de Taubate - Depto. Biologia
 Praca Marcellino Monteiro 63, Bom Conselho,
 Taubate, SP - BRASIL. 12030-010

 TEL: 0XX12-2254165 (lab.), 2254277 (depto.)
 FAX: 0XX12-2322947
 E-Mail: [EMAIL PROTECTED]
 http://www.mundobio.rg3.net/
 







=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Save Money this Month on Toner Cartridges!!

2002-02-27 Thread suhousede89365748






D & J Printing Corporation
2564 Cochise Drive
Acworth, GA 30102
(V)770-974-8228
(F)770-974-7223
[EMAIL PROTECTED]



 --LASER, FAX AND COPIER PRINTER TONER CARTRIDGES--

*WE ACCEPT GOVERNMENT, SCHOOL AND UNIVERSITY PURCHASE ORDERS*

  
***FREE SHIPPING WITH ANY ORDER OF $200 OR MORE!!!***

APPLE

  LASER WRITER SELECT 300/310/360   $60  
  LASER WRITER PRO 600/630 OR 16/600$60 
  LASER WRITER 300/320 OR 4/600 $45   
  LASER WRITER LS/NT/NTR/SC $50
  LASER WRITER 2NT/2NTX/2SC/2F/2G   $50
  LASER WRITER 12/640$60
   
HEWLETT PACKARD

  LASERJET SERIES 1200 (C7115A) $40
  LASERJET SERIES 4100X/4100A (C8061A/X)$99 
  LASERJET SERIES 1100/1100A (C4092A)   $40
  LASERJET SERIES 2100/SE/XI/M/TN (C4096A)  $70
  LASERJET SERIES 2/2D/3/3D (92295A)$43
  LASERJET SERIES 2P/2P+/3P (92275A)$55 
  LASERJET SERIES 3SI/4SI   (92291A)$75  
  LASERJET SERIES 4/4M/4+/4M+/5/5M/5N (92298A/X)$55  
  LASERJET SERIES 4L/4ML/4P/4MP (92274A)$40  
  LASERJET SERIES 4000/T/N/TN  (C4127A/X-H YLD) $70
  LASERJET SERIES 4V/4MV (C3900A)   $80
 
  LASERJET SERIES 5000 (C4129X)$95 
 
  LASERJET SERIES 5L/6L (C3906A)$39
  LASERJET SERIES 5P/5MP/6P/6MP (C3903A)$50
  LASERJET SERIES 5SI/5SI MX/5SI MOPIER/8000(C3909A/X)  $80
  LASERJET SERIES 8100/N/DN ((C4182X)   $100


HEWLETT PACKARD LASERFAX

  LASERFAX 500/700, FX1 $50  
  LASERFAX 5000/7000, FX2   $65
  LASERFAX FX3  $60 
  LASERFAX FX4  $65 

LEXMARK
  
  E312L, E310 (13T0101) $60

  OPTRA 4019, 4029 HIGH YIELD   $130   
  OPTRA R, 4039, 4049 HIGH YIELD$125   
  OPTRA S, 4059 HIGH YIELD  $135   
  
  OPTRA N   $100   

  OPTRA T 610/612/614   $185


EPSON LASER TONER

  EPL-7000/7500/8000$95
   
  EPL-1000/1500 $95


EPSON INK JET

  STYLUS COLOR 440/640/740/760/860 (COLOR)   $20

  STYLUS COLOR 740/760/860  (BLACK)  $20


CANON
  LBP-430   $45  
  LBP-460/465 $55   
  LBP-8 II  $50 
  LBP-LX$54 
  LBP-NX$90 
  LBP-AX$49 
  LBP-EX$59 
  LBP-SX$49 
  LBP-BX$90 
  LBP-PX$49 
  LBP-WX$90 
  LBP-VX$59 


  CANON FAX L700 THRU L790 (FX1)$55 
  CANON FAX L5000 THRU L7500 (FX2)  $65 
  CANON LASERCLASS 4000/4500/300 (FX3)  $60
  CANON LASERCLASS 8500 THRU 9800 (FX4) $65

CANON COPIERS

  PC 1/2/3/6/6RE/7/8/11/12/65 (A30) $69 
  PC 210 THRU 780 (E40/E31)  $80   
 
  PC 300/400 (E20/E16)  $80

NEC

  SERIES 2 LASER MODEL 90/95$100  
  
***FREE SHIPPING WITH ANY ORDER OF $200 OR MORE!!!***

PLEASE NOTE:

 * ALL OF OUR PRICES ARE IN US DOLLARS
 * WE SHIP UPS GROUND.  ADD $6.50 FOR SHIPPING AND HANDLING
 * WE ACCEPT ALL MAJOR CREDIT CARDS OR COD ORDERS.
 * COD CHECK ORDERS ADD $3.50 TO YOUR SHIPPING COST.   
 * OUR STANDARD MERCHANDISE REPLACEMENT POLICY IS NET 90 DAYS.
 * WE DO NOT SELL TO RESELLERS OR BUY FROM DISTRIBUTORS.
 * WE DO NOT CARRY: BROTHER, MINOLTA, KYOCERA, PANASONIC, XEROX, 
FUJITSU, OKIDATA OR SHARP PRODUCTS. 
 * WE ALSO DO NOT CARRY:  DESKJET OR BUBBLEJET SUPPLIES.
 * WE DO NOT BUY FROM OR SELL TO RECYCLERS OR REMANUFACTURERS.

   -PLACE YOUR ORDER AS FOLLOWS-

1) BY PHONE (770) 974-8228
2) BY MAIL:  D AND J PRINTING CORPORATION
 2564 COCHISE DR
 ACWORTH, GA 30102
3) BY INTERNET: [EMAIL PROTECTED]

 

Re: CRIMCOORD transformation in QUEST

2002-02-27 Thread Paul Thompson

That is either sloppiness in the writing or reliance on the relationship 
between the eigendecomposition and the SVD.

SSM - square symmetric matrix
AM - arbitrary matrix

In ED, SSM = Q E Q'
In SVD, AM = P D Q'

SSM = AM' AM
= Q D P' P D Q' = Q D D Q'
= Q E Q', if E = D D
I haven't checked that above, but it is pretty close to accurate.  You 
may need to throw in a division by n.
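A quick numerical check of that identity, as a sketch in numpy (the variable
names follow the SSM/AM shorthand above):

    import numpy as np

    rng = np.random.default_rng(0)
    AM = rng.normal(size=(20, 4))     # arbitrary matrix
    SSM = AM.T @ AM                   # square symmetric matrix

    P, D, Qt = np.linalg.svd(AM, full_matrices=False)   # AM = P diag(D) Q'
    evals, Q = np.linalg.eigh(SSM)                      # SSM = Q diag(E) Q'

    # Eigenvalues of AM'AM are the squared singular values of AM (E = D*D)
    print(np.allclose(np.sort(evals), np.sort(D**2)))

    # The leading right singular vector of AM is the eigenvector of AM'AM
    # with the largest eigenvalue, up to sign
    v_svd = Qt[0]                       # first right singular vector
    v_eig = Q[:, np.argmax(evals)]
    print(np.allclose(np.abs(v_svd), np.abs(v_eig)))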

David Chang wrote:

Hi, thank you for reading this message. I have the following problems in
getting the correct CRIMCOORD transformation of categorical variables
in the QUEST decision tree algorithm. Your help will be greatly appreciated.

Q1: In Loh & Shih's paper (Split Selection Methods for Classification
Trees, Statistica Sinica, 1997, vol 7, p815-840), they mention the
mapping from a categorical variable to an ordered variable via CRIMCOORD.
But their explanation, in particular step 5 of algorithm 2, is not
clear. For example, they write "Perform a singular value decomposition
of the matrix GFU" and let a (vector) be the eigenvector (of what?)
associated with the largest eigenvalue in step 5. Does this mean
a (vector) is the eigenvector of transpose(GFU)*GFU?

Q2.
I tried to verify the data sets in Table 1. Data sets I-III are OK. But
the result for data set IV seems to be incorrect. Could any of you
help me verify that?

Thank you very much for your help !!

David




=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



EDSTAT list

2002-02-27 Thread E. Jacquelin Dietz

Dear EdStat readers,

During the next week (probably Friday or Monday), the EdStat list will
move to a new server.  At that time, we will also start using a new
version of the Majordomo software.  We hope that these changes will
reduce the amount of spam sent to the list.  In addition, the threat of
viruses will be reduced because attachments will no longer be
allowed.

We hope the transition will be smooth, and we will try to keep you
informed as changes are implemented.  If problems arise, please be
patient and check the web page http://jse.stat.ncsu.edu for information.

Jackie Dietz
Listowner
-- 

  E. Jacquelin Dietz   (919) 515-1929  (phone) 
  Department of Statistics, Box 8203   (919) 515-1169  (FAX) 
  North Carolina State University  
  Raleigh, NC  27695-8203   USA[EMAIL PROTECTED] 

  Street address for FedEx:  
  Room 210E Patterson Hall, 2501 Founders Drive



=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Applied analysis question

2002-02-27 Thread Brad Anderson

I have a continuous response variable that ranges from 0 to 750.  I
only have 90 observations and 26 are at the lower limit of 0, which is
the modal category.  The mean is about 60 and the median is 3; the
distribution is highly skewed, extremely kurtotic, etc.  Obviously,
none of the power transformations are especially useful.  The product
moment correlation between the response and the primary covariate is
near zero; however, a rank-order correlation coefficient is about .3
and is significant.  We have 5 additional control variables.  I'm
convinced that any attempt to model the conditional mean response is
completely inappropriate, yet all of the alternatives appear flawed as
well.  Here's what I've done:

I've collapsed the outcome into 3- and 4- category ordered response
variables and estimated ordered logit models.  I dichotomized the
response (any vs none) and estimated binomial logit.  All of these
approaches yield substantively consistent results using both the model
based standard errors and the Huber-White sandwich robust standard
errors.  My concerns about this approach are 1) the somewhat arbitrary
classification restricts the observed variability, and 2) the
estimators assume large sample sizes.

I rank transformed the response variable and estimated a robust
regression (using the rreg procedure in Stata)--results were
consistent with those obtained for the ordered and binomial logit
models described above.  I know that Stokes, Davis, and Koch have
presented procedures to estimate analysis of covariance on ranks, but
I've not seen reference to the use of rank transformed response
variables in a regression context.
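For what it is worth, the rank-based comparison is only a few lines; a
minimal sketch with made-up data standing in for the real response and
covariate (numpy/scipy assumed):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(size=90)                      # stand-in covariate
    # Zero-inflated, right-skewed stand-in response (illustration only)
    y = np.where(rng.random(90) < 0.3, 0.0,
                 np.exp(2.0 + 1.2 * x + rng.normal(size=90)))

    print(stats.pearsonr(x, y))    # attenuated by the zeros and the long tail
    print(stats.spearmanr(x, y))   # rank-based, far less sensitive to the skew

    # Regressing the rank-transformed response on x uses the same ordering
    # information that the Spearman coefficient does
    print(stats.pearsonr(x, stats.rankdata(y)))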

A plot of the rank-transformed response with the primary covariate
clearly suggests a meaningful pattern.  Contingency table analysis
with a collapsed covariate strongly suggests a meaningful pattern.  But
I'm at something of a loss to know the best way to analyze and report
the results.  Thanks in advance.


=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Re: Applied analysis question

2002-02-27 Thread Dennis Roberts

At 04:11 PM 2/27/02 -0500, Rich Ulrich wrote:

Categorizing the values into a few categories labeled
none, almost none, ... is one way to convert your scores,
if those labels do make sense.

well, if 750 has the same numerical sort of meaning as 0 (unit wise) ... in 
terms of what is being measured then i would personally not think so SINCE, 
the categories above 0 will encompass very wide ranges of possible values

if the scale was # of emails you look at in a day ... and 1/3 said none or 
0 ... we could rename the scale 0 = not any, 1 to 50 as = some, and 51 to 
750 as = many (and recode as 1, 2, and 3) .. i don't think anyone who just 
saw the labels ... and were then asked to give some extemporaneous 'values' 
for each of the categories ... would have any clue what to put in for the 
some and many categories ... but i would predict they would seriously 
UNderestimate the values compared to the ACTUAL responses
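for what it's worth, the recode itself is only a couple of lines ... a sketch 
in python/pandas, using the 0 / 1 to 50 / 51 to 750 cut points above ...

    import pandas as pd

    emails = pd.Series([0, 0, 3, 12, 47, 51, 120, 750])
    label = pd.cut(emails, bins=[-1, 0, 50, 750],
                   labels=['not any', 'some', 'many'])
    code = label.cat.codes + 1      # recode as 1, 2, 3
    print(pd.DataFrame({'emails': emails, 'label': label, 'code': code}))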

this just highlights that for some scales, we have almost no 
differentiation at one end where they pile up ... perhaps (not saying one 
could have in this case) we could have anticipated this ahead of time and 
put scale categories that might have anticipated that

after the fact, we are more or less dead ducks

i would say this though ... treating the data only in terms of ranks ... 
does not really solve anything ... and clearly represents being able to say 
LESS about your data or interrelationships (even if the rank order r is .3 
compared to the regular pearson of about 0) ... than if you did not resort 
to only thinking about the data in rank terms




--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html



Dennis Roberts, 208 Cedar Bldg., University Park PA 16802
Emailto: [EMAIL PROTECTED]
WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm
AC 8148632401



=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Re: Means of semantic differential scales

2002-02-27 Thread Dennis Roberts

At 01:39 PM 2/27/02 -0600, Jay Warner wrote:

  
  Not stressful 1__ 2__ 3__ 4__ 5__ 6__ 7__ Very stressful

just out of curiosity ... how many consider the above to be an example of a 
bipolar scale?

i don't

now, if we had an item like:

sad . . . . . happy
1 . . . . . . 7

THEN the mid point becomes much more problematic ...

since being a 4 ... is neither a downer nor upper

now, a quick search found info from ncs about the 16pf personality scale 
... it shows 16 BIpolar dimensions as:

Bipolar Dimensions of Personality
Factor A Warmth (Cool vs Warm)
Factor B Intelligence (Concrete Thinking vs Abstract Thinking)
Factor C Emotional Stability (Easily Upset vs Calm)
Factor E Dominance (Not Assertive vs Dominant)
Factor F Impulsiveness (Sober vs Enthusiastic)
Factor G Conformity (Expedient vs Conscientious)
Factor H Boldness (Shy vs Venturesome)
Factor I Sensitivity (Tough-Minded vs Sensitive)
Factor L Suspiciousness (Trusting vs Suspicious)
Factor M Imagination (Practical vs Imaginative)
Factor N Shrewdness (Forthright vs Shrewd)
Factor O Insecurity (Self-Assured vs Self-Doubting)
Factor Q1 Radicalism (Conservative vs Experimenting)
Factor Q2 Self-Sufficiency (Group-Oriented vs Self-Sufficient)
Factor Q3 Self-Discipline (Undisciplined vs Self-Disciplined)
Factor Q4 Tension (Relaxed vs Tense)

let's take the one ... shy versus venturesome ...

now, we could make a venturesome scale by itself ...

0 venturesomeness .. (up to)  very venturesome 7

does 0 = shy? seems like if the answer is no ... then we might have a 
bipolar scale ... if the answer is yes ... then we don't



  It could be the use of the particular bipolars
  "not stressful" and "very stressful".
=

Dennis Roberts, 208 Cedar Bldg., University Park PA 16802
Emailto: [EMAIL PROTECTED]
WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm
AC 8148632401



=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Re: Statistics Tool For Classification/Clustering

2002-02-27 Thread Mark Harrison

Good places to start:

Optimal feature extractors, that's better than PCA because you whiten your
inter class scatter and so put all inter class comparisons on the same
level. The good thing is this will also reduce your feature vector
dimensionality to c-1 (where c is # classes). PCA will not do this.

Check the stats of each class: is it Gaussian or a known pdf? Apply a
parametric classifier if so.

However, you are lucky if you get good classification after this, so you will
probably need non-linear, non-parametric classifiers. Try K nearest
neighbours, but that might take the age of the Universe, so use a condensing
algorithm first to get a smaller representative set.

Matlab is what I use for coding; there are a lot of free toolboxes around.
Mostly I write my own, though.
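If Python with scikit-learn is an option instead of Matlab (an assumption, not
what the post above uses), the same pipeline fits in a few lines; here
LinearDiscriminantAnalysis stands in for the "optimal feature extractor" (it
projects to at most c-1 dimensions) and a toy dataset stands in for the real
feature vectors:

    from sklearn.datasets import load_iris
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    X, y = load_iris(return_X_y=True)    # toy stand-in for the music features
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Project to at most c-1 discriminant dimensions, then classify by
    # nearest neighbours in the reduced space
    clf = make_pipeline(LinearDiscriminantAnalysis(n_components=2),
                        KNeighborsClassifier(n_neighbors=5))
    clf.fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))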

Best wishes

Andrew


Rishabh Gupta [EMAIL PROTECTED] wrote in message
news:a4eje9$ip8$[EMAIL PROTECTED];
 Hi All,
 I'm a research student at the Department Of Electronics, University Of
 York, UK. I'm working on a project related to music analysis and
 classification. I am at the stage where I perform some analysis on music
 files (currently only in MIDI format) and extract about 500 variables that
 are related to music properties like pitch, rhythm, polyphony and volume.
I
 am performing basic analysis like mean and standard deviation but then I
 also perform more elaborate analysis like measuring complexity of melody
and
 rhythm.

 The aim is that the variables obtained can be used to perform a number of
 different operations.
 - The variables can be used to classify / categorise each piece of
 music, on its own, in terms of some meta classifier (e.g. rock, pop,
 classical).
 - The variables can be used to perform comparison between two files. A
 variable from one music file can be compared to the equivalent variable in
 the other music file. By comparing all the variables in one file with the
 equivalent variable in the other file, an overall similarity measurement
can
 be obtained.

 The next stage is to test the ability of the of the variables obtained to
 perform the classification / comparison. I need to identify variables that
 are redundant (redundant in the sense of 'they do not provide any
 information' and 'they provide the same information as the other
variable')
 so that they can be removed and I need to identify variables that are
 distinguishing (provide the most amount of information).

 My Basic Questions Are:
 - What are the best statistical techniques / methods that should be
 applied here. E.g. I have looked at Principal Component Analysis; this
would
 be a good method to remove the redundant variables and hence reduce some
the
 amount of data that needs to be processed. Can anyone suggest any other
 sensible statistical analysis methods?
 - What are the ideal tools / software to perform the clustering /
 classification. I have access to SPSS software but I have never used it
 before and am not really sure how to apply it or whether it is any good
when
 dealing with 100s of variables.

 So far I have been analysing each variable on its own 'by eye' by plotting
 the mean and sd for all music files. However this approach is not feasible
 in the long term since I am dealing with such a large number of variables.
 In addition, by looking at each variable on its own, I do not find
clusters
 / patterns that are only visible through multivariate analysis. If anyone
 can recommend a better approach, it would be greatly appreciated.

 Any help or suggestion that can be offered will be greatly appreciated.

 Many Thanks!

 Rishabh Gupta






=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Re: Statistics Tool For Classification/Clustering

2002-02-27 Thread Mark Harrison

Correction (typo): should read 'Whiten intra class scatter'.

Mark Harrison [EMAIL PROTECTED] wrote in message
news:FIif8.16518$[EMAIL PROTECTED];
 Good places to start:

 Optimal feature extractors, that's better than PCA because you whiten your
 inter class scatter and so put all inter class comparisons on the same
 level. The good thing is this will also reduce your feature vector
 dimensionality to c-1 (where c is # classes). PCA will not do this.


=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=



Re: Applied analysis question

2002-02-27 Thread Rolf Dalin

Brad Anderson wrote:

 I have a continuous response variable that ranges from 0 to 750.  I only
 have 90 observations and 26 are at the lower limit of 0, 

What if you treated the information collected by that variable as really
two variables, one being a categorical variable indicating a zero or non-zero
value? Then the remaining numerical variable could only be analyzed
conditionally on the category being non-zero.

In many cases when you collect data on consumers' consumption of 
some commodity, you end up with a large number of them not 
using the product at all, while those who do use the product 
consume different amounts.
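A rough sketch of that two-part (hurdle-style) idea, with made-up data and
statsmodels assumed; the first part models zero versus non-zero, the second
models the amount among the non-zero cases:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x = rng.normal(size=90)
    # Made-up zero-inflated outcome, for illustration only
    p_use = 1.0 / (1.0 + np.exp(-(0.3 + 0.8 * x)))
    y = np.where(rng.random(90) < p_use,
                 np.exp(3.0 + 0.5 * x + rng.normal(size=90)), 0.0)

    X = sm.add_constant(x)
    used = y > 0

    part1 = sm.Logit(used.astype(int), X).fit(disp=0)   # any use at all?
    part2 = sm.OLS(np.log(y[used]), X[used]).fit()      # amount, given use

    print(part1.params)
    print(part2.params)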

Rolf Dalin
**
Rolf Dalin
Department of Information Technology and Media
Mid Sweden University
S-870 51 SUNDSVALL
Sweden
Phone: 060 148690, international: +46 60 148690
Fax: 060 148970, international: +46 60 148970
Mobile: 0705 947896, international: +46 70 5947896

mailto:[EMAIL PROTECTED]
http://www.itk.mh.se/~roldal/
**


=
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
  http://jse.stat.ncsu.edu/
=